SPL Big Data Computing
Big Data
- Data warehouse running on file system
- Lightweight big data processing technology
- Data warehouse without using SQL
- Hadoop/Spark is too heavy, esProc SPL is light
- Why Big Data Platforms Return to SQL?
- Is distributed technology the panacea for big data processing?
- The performance problems of data warehouse and solutions
- How the performance improvement by orders of magnitude happened
- Why batch jobs are so difficult?
- Data warehouse with “no house” performs better than the one with “the house”
- Calculate Client Churn Rate
- With lightweight SPL available, how necessary is MPP?
- Is It Necessary to Use a Specialized In-Memory Database?
- How to implement an efficient logical data warehouse? Try SPL!
- Does stream computing need a framework? SPL may be a better choice
- HTAP database cannot handle HTAP requirements
- The current Lakehouse is like a false proposition
- The Data Lake’s Impossible Triangle
Practice
- SPL practice: space-time collision problem that renders MPP powerless to solve
- SPL practice: solve space-time collision problem of trillion-scale calculations in only three minutes
- SPL practice: high-concurrency account queries
- SPL practice: multi-index calculation in real time
- SPL practice: customer profile
- SPL practice: funnel analysis
- SPL practice: query massive and flexible structured data
- SPL practice: improve concurrency through route calculation
- SPL practice: implement real-time write, second-level count of daily 10 billion time series data on a single node
- SPL practice: search high-dimensional binary vectors
Data maintenance
- SPL practice: data flow during speeding up batch job
- SPL practice: migrate computing tasks out of database
- SPL practice: integerization during data dump
- Routine for regular and active update of small amounts of data
- Routine for regular maintenance of single composite table
- Routine for regular maintenance of multi-zone composite table
- Routine for real-time data appending
- Routine for real-time data updating
- Routine for real-time data appending – avoid small tables by means of memory
User Behavior Analysis
- User Behavior Analysis in Practice 1: Conventional Grouping and Aggregation
- User Behavior Analysis in Practice 2: Redundant Grouping Key Field
- User Behavior Analysis in Practice 3: Order-based Filtering Using Binary Search
- User Behavior Analysis in Practice 4: Using Column-wise Storage
- User Behavior Analysis in Practice 5: Using Dimension Table
- User Behavior Analysis in Practice 6: Numberizing the Dimension Table
- User Behavior Analysis in Practice 7: Dimension Table Filtering
- User Behavior Analysis in Practice 8: The Changing Dimension Table
- User Behavior Analysis in Practice 9: Enumerated Dimension and Tag Dimension
- User Behavior Analysis in Practice 10: Ordered Storage by Account
- User Behavior Analysis in Practice 11: Order-based Grouping
- User Behavior Analysis in Practice 12: Using Pseudo Tables
- User Behavior Analysis in Practice 13: Bi-dimension Ordering
- User Behavior Analysis in Practice 14: Real-time T+0 Analysis
Multidimensional Analysis Server
- Multidimensional Analysis Backend Practice 1: Basic Wide Table
- Multidimensional Analysis Backend Practice 2: Data Type Optimization
- Multidimensional Analysis Backend Practice 3: Dimensions Sorting and Compression
- Multidimensional Analysis Backend Practice 4: Pre-aggregate and Redundant Sorting
- Multidimensional Analysis Backend Practice 5: Small Fact Table Associated with Small Dimension Table
- Multidimensional Analysis Backend Practice 6: Big Fact Table Associated with Small Dimension Table
- Multidimensional Analysis Backend Practice 7: Boolean Dimension and Binary Dimension
- Multidimensional Analysis Backend Practice 8: Primary-sub Table and Parallel Calculation