As an example scenario, let's say we have a table that hold folders or areas that other records can be linked to, a document record, for example. Now we want to be able to have as many folders and sub-folders as we like... so we contain these in a table like so:
Use the below to create this table as a temporary table.
create table #folders (id int identity(1,1),FolderName varchar(20),parentID int);Now lets say we have a document assigned to folder ID 8 (Bot2a). We want to be able to show the full path that the folder is contained in, but that information isn't found in the #folders table. We are able to trace it back though.
insert into #folders(FolderName,parentID) values
('Top1',0),
('Top2',0),
('Mid1',1),
('Bot1',3),
('Mid2a',2),
('Mid2b',2),
('NextMid2a',5),
('Bot2a',7);
Looking at my extremely straight arrows, we can see that each folder is assigned a parent folder, with the top level folders having a parentID of 0. So we can see that the full folder path for a document in Bot2a will be "Top2\Mid2a\NextMid2a\Bot2a." So how do we get the path without writing a convoluted while loop? Enter the recursive CTE.
Just like any normal CTE, it will have a query contained in the WITH section, however this time the must consist of two unioned select statements. The first one will be the Anchor statement, containing only the top level records, and the second statement will be the recursive one, excluding the root folders. Here:
with CTE asRunning this will give you the following results:
(
select ID,folderName,parentID,cast(folderName as varchar(250)) as path,1 as PathLevel
from #folders
where parentID = 0
union all
select f.id,f.foldername,f.parentID,cast(cte.path + '\' + f.folderName as varchar(250)),cte.Pathlevel + 1
from #folders f
join cte on cte.ID = f.parentID
where f.ID <> 0
)
select * from CTE;
So you can see each folder is presented with its full path, up to and including itself. Let's process what we just did. In the statement highlighted in orange, we simply queried to return only the top level folders. We also created a Path and PathLevel column. For the anchor records, the full path will be equal to just the folder name, and we'll assign PathLevel a value of one to represent at what level in a path the current folder sits in.
In the recursive statement highlighted in blue, you can see we joined the CTE itself! This is where we link the current record to its parent in the anchor statement. We can then refer to the previous iteration and build on the Path string value, or add a 1 for the level.
Pretty easy. Pretty quick.
Bear in mind that this can be an expensive query with large set of data with large numbers of recursions. Also, if a mistake is made with your joins you could potentially end up in an in infinite loop. Or at least you would if SQL Server didn't enforce a default limit of 100 recursions. This essentially means that (in this example) you can't get more than 100 folders in your path. If you exceed this you will be presented with the following error:
Msg 530, Level 16, State 1, Line 1
The statement terminated. The maximum recursion 100 has been exhausted before statement completion.
The statement terminated. The maximum recursion 100 has been exhausted before statement completion.
If you want to see this for yourself, simply change the join to "join cte on cte.ID = cte.ID" to create an infinite loop. You can change the default limit though by simply adding the following option at the end of your full statement,"OPTION (MAXRECURSION n)" where of course n is whatever value you choose. Try changing the MAXRECURSION limit to 200 and running the CTE with the infinite loop again.
What other uses for a recursive CTE can you think of?
No comments:
Post a Comment