Category: 2-SPL-Application-Development
How to Implement JOIN Operations Between Tables from Different Databases with esProc
When data analysis involves different business systems, cross-database computations become necessary. Of these computations, JOIN between tables are particularly challenging, as many databases lack this capability. Although Java can retrieve data and perform calculations, it is too complex. esProc offers a much simpler solution for cross-database JOIN operations. Data and use cases A vehicle management system (DB_Vehicle) stores information about vehicles and their owners. The simplified structure for the owner_info table is shown below: The primary key owner_id is the ID number of vehicle owners. The simplified structure for the vehicle_master table is as follows: The vin is designated as
How to simplify MongoDB queries with esProc
The native query syntax of MongoDB is quite cumbersome, and simple tasks require long code. Complex calculations are even more difficult to implement, such as: esProc provides a MongoDB API with built-in powerful calculation functions that simplify MongoDB queries. Next, let’s try how to integrate esProc into an application. Download and install esProc first, recommend standard version: https://www.esproc.com/download-esproc Then download the esProc external library to access external data sources such as MongoDB, also from the above address.Extract the zip file of the external library to any directory, such as d:\esProcSTD\extlibStart the esProc IDE, open the menu “Tools ->Options”, find the
How to supplement the missing capabilities of database SQL with esProc
Some database SQL lacks necessary capabilities and usually requires writing large sections of code to indirectly implement similar functions. In some cases, stored procedures are even used to drive architecture changes. Common examples include: generating time series, retaining grouped subsets, dynamic row column conversion, natural numbering, relative position, generating multiple records by sequence and set, cumulative calculation, conditional grouping, cross database calculation, set calculation, sequence calculation, self-association structure, recursive calculation, aligned association, etc. Use the following examples to quickly experience it. Generate time series: The Time field of a certain database table is time, and the time interval is sometimes
How to simplify nested SQL in applications with esProc
In data analysis or reporting applications, various complex SQL statements are sometimes encountered: nested in multiple layers, self-join, dynamically transposing, … Here are some exmaples: From SQL to SPL:Create columns from distinct values of a column From SQL to SPL: Statistics by time window From SQL to SPL: Align the existing data to the corresponding position and fill in any missing data with 0 For the same task, the code of esProc SPL is simpler and easier to understand than SQL. Next, let’s try how to integrate esProc in applications and simplify these complex SQL statements. Download esProc first, recommend
How to Speed Up Conditional Filtering on Enumerated Fields with esProc
Enumerated fields in data tables have a limited number of predefined values. Filtering conditions on the enumerated field f are often expressed as f =v1 or f =v2 or …; f !=v1 and f !=v2 and …; in; or not in. The database has to compare f with n values. When the data table is large, the number of comparisons will be very high, leading to poor performance, and the performance worsens as n increases. If comparisons can be avoided during filtering, performance will naturally improve significantly. esProc’s aligned sequence mechanism can achieve this effect. The following example uses the
How to Speed Up EXISTS in Primary-Subtable Joins with esProc
In databases, EXISTS operations between large primary and sub tables often result in poor performance. Such operations are essentially join operations. If the primary and sub tables are pre-sorted by their primary keys, the ordered merge algorithm can be employed to significantly improve performance. This algorithm requires only sequential traversal of the two tables, eliminating external storage buffering and drastically reducing I/O and computation. esProc supports the ordered merge algorithm, allowing it to transform EXISTS operations on primary and sub tables into ordered merges, thereby significantly improving computational performance. Below, we will use the example of orders and order details
How to Speed Up Associations between Large Primary and Sub Tables with esProc
Associations between primary and sub tables, like orders and order details tables, are quite common. SQL uses JOIN to perform such associations, but performance often suffers significantly when both tables are large. If both the primary and sub tables are pre-sorted by their primary keys, a merge algorithm can be used to perform association. This method requires only sequential traversal of the two tables, eliminating external storage buffering and drastically reducing I/O and computation. esProc supports the ordered merge algorithm, which can greatly improve primary-subtable join performance. To illustrate this, we’ll compare the performance of esProc SPL and MySQL using
How to Speed Up JOIN to Avoid Wide Tables with esProc
In data analysis applications, multi-table JOIN operations within a database often involve complex SQL and exhibit suboptimal JOIN performance. As a result, multiple tables are often joined to a wide table. For example, the orders table and several dimension tables in the figure below are likely to be converted into a wide table: However, wide tables pose many issues: significant data redundancy, non-compliance with normalization requirements (prone to errors), and the need to update the entire wide table when dimension table data changes. Moreover, the computational performance of wide tables isn’t necessarily better than that of multi-table joins. esProc specifically
How to Speed Up COUNT DISTINCT with esProc
The COUNT DISTINCT in SQL has always been relatively slow. De-duplication is essentially a grouping operation that requires retaining all the traversed grouping field values for subsequent comparisons. When the result set is too large, the data must also be written to hard drive for buffering, which leads to poor performance. If the data is first sorted by the de-duplication field, calculating ordered de-duplication will be much simpler, as it only requires saving the field that differs from the previous record during traversal and incrementing the count by 1. This eliminates the need to retain the result set, let alone
How to Dump Database Tables into Files to Speed Up Queries with esProc
Large data volumes or high database loads can both lead to slow database query performance. In these situations, using esProc to export the data and store into files for subsequent computation can significantly improve performance. Data and use cases A MySQL database contains the orders_30m table, which stores historical order data spanning multiple years. The table structure is as follows: Sample data: 1 3001 2023-01-05 701 Smartphone Z 1 699.99 699.99 Credit Card 888 Eighth St, Charlotte, NC Delivered 2 3002 2023-02-10 702 Smart Scale 1 49.99 49.99 PayPal 999 Ninth Ave, Indianapolis, IN Delivered 3 3003 2023-03-15 703 Laptop







