Using CTE (Common Table Expression) to achieve lazy loading and querying infinite hierarchical data structure


July 25, 2023 Program

Using CTE (Common Table Expression) to achieve lazy loading and querying infinite hierarchical data structure
Recording the backend implementation of Lazy loading querying infinite hierarchical data structure.

Foreword

🔗

This article focuses on querying and implementing lazy loading for an infinite hierarchical data structure, which is commonly encountered in social media platforms that offer comment functionalities. Users can reply to specific comments, generating multi-level comments. If the system does not have restrictions, it may lead to an infinite depth of comments. This article records how to query and implement lazy loading for such infinite hierarchical data structure.

Infinite hierarchical data structure

🔗
Field NameDescription
comment_idcomment ID
parent_idparent comment ID, if null, represents the first level.

This structure can store comments and their hierarchical relationships. The comment_id is used to identify each comment, while the parent_id indicates which parent comment the comment is replying to.

Implementing lazy loading

🔗

Lazy loading is implemented on the frontend, while the backend is responsible for sequentially providing corresponding data. By utilizing the OFFSET-FETCH pagination technique, the backend can provide the desired data in the specified order.

e.g.

SQL
 OFFSET 0 ROWS FETCH NEXT 2 ROWS ONLY    

To retrieve data starting from the 0th record and fetching the next two records is equivalent to obtaining the 1st and 2nd records (modifiable as per requirements).

Using CTE to implement reading infinite hierarchical data structure

🔗

When sorting the comments based on the creation time, we can retrieve data in the correct order to resolve the issue of lazy loading. However, a problem arises with the creation time of child comments not always being in chronological order. This means that the oldest comment at the first level might have newer child comments. If we strictly sort based on creation time and return the latest 10 comments, the frontend might receive the newest child comment without getting the corresponding parent comment from the first level, causing display issues on the interface.

To address this issue, the data retrieval approach involves pagination and sorting based on the creation time of the first-level comments. It fetches the latest 10 first-level comments along with all their corresponding child comments. This ensures that there will be no mismatch between child comments and their respective parent comments, eliminating any problems arising from child comments not corresponding to their parent comments.

However, due to the possibility of having an infinite number of child comments, it is necessary to use CTE (Common Table Expression) to ensure that data from each level is accurately retrieved.

Assuming the table name is comment_history

e.g.

SQL
WITH CommentHierarchy AS (
    -- Anchor member definition
    SELECT
        comment_id,
        parent_id,
        message,
        created_time,
        created_user,
        1 AS Level
    FROM
        comment_history
    WHERE
        parent_id IS NULL -- Fetching top-level comments
		and comment_id in (
  select comment_id from comment_history
  where parent_id is null 
  order by created_time desc 
  OFFSET 0 ROWS FETCH NEXT 10 ROWS ONLY)
	

    UNION ALL

    -- Recursive member definition
    SELECT
        c.comment_id,
        c.parent_id,
        c.message,
        c.created_time,
        c.created_user,
        ch.Level + 1 AS Level
    FROM
        comment_history c
    INNER JOIN
        CommentHierarchy ch ON c.parent_id = ch.comment_id
)
SELECT 
    comment_id,
    parent_id,
    message,
    created_time,
    created_user,
	Level
FROM
    CommentHierarchy
ORDER BY
    created_time DESC  

Syntax Explanation:

Create the CommentHierarchy, first fetching the original data of comments where the condition is specified.

parent_id IS NULL (It represents the first level.) 

and

comment_id in (
  select comment_id from comment_history
  where parent_id is null 
  order by created_time desc 
  OFFSET 0 ROWS FETCH NEXT 10 ROWS ONLY)

In conjunction with lazy loading, fetch the latest 10 comments and create a new column "Level" with a value of 1 (representing first-level comments).


After obtaining the first level, perform a JOIN between the original data and CommentHierarchy, binding the relationship where the parent_id in the original data matches the comment_id in the CommentHierarchy.(Retrieve the second-level child comments using the comment_id from the first level.)

-- Recursive member definition
    SELECT
        c.comment_id,
        c.parent_id,
        c.message,
        c.created_time,
        c.created_user,
        ch.Level + 1 AS Level
    FROM
        comment_history c
    INNER JOIN
        CommentHierarchy ch ON c.parent_id = ch.comment_id

Continuously use the CommentHierarchy's comment_id to determine if there are child comments, obtaining each level of child comments in a recursive manner, and finally merging all the data together.

Conclusion

🔗

With the above approach, you can obtain all child comments of parent comments and label the level of each comment.

comment_idparent_idmessagecreated_timecreated_userLevel
138808E8-5FFA-4F26-927D-3E06654079BC82D25141-29D3-4135-A162-A588B61AA6EF22332023-07-25 11:55:44.443Alvin2
0D46466C-3725-4FBA-9676-C1D364ED0D1682D25141-29D3-4135-A162-A588B61AA6EF11122023-07-25 11:16:55.977OKA2
548439B1-7D6C-4268-88CA-A52570DC540930C7A6F8-4E69-4F3C-836B-04AEBAE46F84test2023-07-25 11:10:43.610David3
30C7A6F8-4E69-4F3C-836B-04AEBAE46F8482D25141-29D3-4135-A162-A588B61AA6EFXD2023-07-25 11:10:17.560Alvin2
82D25141-29D3-4135-A162-A588B61AA6EFNULLso tired2023-07-25 11:09:52.187OKA1
6722E047-5547-4EE7-8959-938A10534858B6C20E72-2F7A-4891-B8A3-301DBBF72CC6cool~2023-07-25 11:02:23.717Alvin2
2281BDB4-7B21-442F-84C5-A0736BE7DD9AB6C20E72-2F7A-4891-B8A3-301DBBF72CC6wow new system2023-07-25 11:02:19.607Alvin2
B6C20E72-2F7A-4891-B8A3-301DBBF72CC6NULLwhat is this?2023-07-25 11:02:09.060JSON1

SQL



Avatar

Alvin

Software engineer, interested in financial knowledge, health concepts, psychology, independent travel, and system design.

Related Posts