Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Getting a pointer to a row from an Index

Status
Not open for further replies.

stanlyn

Programmer
Sep 3, 2003
945
US
Hi,

Is there any way to get a pointer to a row from an index? Or is there a better way to do this. Let me explain...

Table: BOOKS with an INDEX on book_type + book_number + page-number. All these fields are char and padded to fill their fixed sizes.

Other fields include author, ocr_text and date.

Now we do a:
select author, ocr_text, date from books where author='jim' and ocr_text contains 'red oak tree' and between(date, '01/01/2013', '12/15/2013') order by author

... and yes I know some of the syntax is not correct, but anyway I get a result set back that matches the query. Now lets say our result set has:

Deed Book 232 Page 124
Deed Book 436 Page 444
Deed Book 501 Page 123
Mortgage Book 555 Page 741
Mortgage Book 474 Page 654

Note that these results matches what the query asked for... So far no problem as it gives us what we expect.

Our form has buttons "FIRST HIT", "PREV HIT", "NEXT HIT", and "LAST HIT" that traverses the matching query hits in our list above.
Our form also has 2 more buttons "PREV PAGE" and "NEXT PAGE" that simulates turning pages in the book WITH NO RESPECT to the query results. In other words, lets say we first arrive at "Deed Book 232 Page 124" and we press the "PREV PAGE" button, we should be seeing the page immediately to the left(previous) of it. Also note that whatever page we are looking at pressing the "PREV PAGE" will always take us to the previous page, while pressing the "NEXT PAGE" will always take us to the next page without respect of the query results. These next and previous pages we are navigating to may and MAY NOT be in our query results, as it navigates the library in library order without regard to the query results. The index above mentioned is the library order index.

Now, lets say that we want to look at the page IMMEDIATELY BEFORE "Deed Book 232 Page 124 SubPg 1" which is "Deed Book 232 Page 123 SubPg 1" assuming it exists, or it could be "Deed Book 232 Page 99 SubPg 1", if there are no pages between 124 and 99.

There is no way I can think of that allows me to find the prev row in library order from any given place as the only relationship it has is "they are beside each other" in the index and nothing else. I can't see making a primary/foreign key or any other column relationship work in a dynamic environment where new items are constantly coming in, other than reading this library index. So unless someone comes up with something, the only way I see this working is somehow reading the library index for a pointer to the prev and.or next rows.

And yes, I know that SQL Server doesn't have any concept of records. I come from a VFP background where this kind of stuff is easy...

Thanks, Stanley
 
George,

> The real question is... how few inserts can I do before something breaks?

Yes, that is why I said "the big ?..." By using 6, 8, or 12 decimal places for this lib_index field, it can take a lot of inserts before it breaks. Also note that we are only centering the entry between the lookup's high and low values, so this will negate much of the impact you describe above. You ARE seeing this correctly, however in reality, the operator will probably scan a batch of deeds (for talking purposes) from deed book 200, and will move onto a batch of mortgages, and onto another batch of marriages. Each of these batches will be appended to the table's end but scattered about the lib_index. Lets say the scanned page is "Deed book 451 page 47. The decision used to process this is:
1. lookup to see if it already exists and if so, do nothing,
2. if not exists, do this to get the LOW value from the lib_index field:
Select Top 1 * From Page
Where county_id = @county and page_type = @type and Book_Num = @book and page_prefix = @prefix
And cast(Page_Num as int) < cast(@pg as int) And cast(SubPage_Num as int) = cast(@sub as int)
Order BY county_id, page_type, book_num, page_prefix, cast(Page_Num as int) DESC

3. if not exists, next do this to get the HIGH value from the lib_index field:
Select Top 1 * From Page
Where county_id = @county and page_type = @type and Book_Num = @book and page_prefix = @prefix
And cast(Page_Num as int) > cast(@pg as int) And cast(SubPage_Num as int) = cast(@sub as int)
Order BY county_id, page_type, book_num, page_prefix, cast(Page_Num as int) asc

4. Then we add the 2 HIGH and LOW numbers together and divide by 2 which gives us the new lib_index value for the new inserted record.

The quickest way for this to give us a problem would be if the operator scanned many consecutive pages. Wonder if 12 decimal places will allow for several thousand consecutive page inserts per day? Any non-consecutive inserts should not present a problem.

A nightly "reorder/rebuild" of this "lib_index" field is planned that sets each row back to a whole number by removing all the decimals by renumbering the whole table. This field has no other purpose than to allow a user view a page in whatever book and/or page type and by pressing the "PrevPage" button they will be looking at the previous page as if they flipped the page back in a book. By doing it this way the select that gets the prev page is as simple as

"select * from page where lib_index = current lib_index - 1".

When viewing that page, pressing the prev button again does the exact query by flips the page back one more page and so on. All this works without knowing if the prev/next book_number or page type actually exists or if we just navigated into a different page type. It simply navigates all pages in the table (without regard to what book number or page types) using this lib_index field that was assigned its values based on the absolute library order.


Below is some VFP code that is used to create the absolute library order. It takes care of padding "character data" with zeros on the right and padding "numeric data" with zeros to the left.

* This sql version of lib_sort includes system_id and '~' field separators.
dt= document_type
bn= book_num
dp= document_prefix
lcSqlLibSort = Alltrim(system_id)+Replicate('0', 5-Len(Alltrim(system_id))) + '~'+ Alltrim(dt)+Replicate('0', 40-Len(Alltrim(dt))) + '~'+ ;
Iif(Len(Alltrim(bn)) < 1, Replicate('0', 24), Iif(Isdigit(Left(Rtrim(bn),1)) = .T., Replicate('0', 24-Len(Alltrim(bn))) + Alltrim(bn), Alltrim(bn)+Replicate('0', 24-Len(Alltrim(bn)))))+'~'+ ;
Iif(Len(Alltrim(dp)) < 1, Replicate('0', 24), Iif(Isdigit(Left(Rtrim(dp),1)) = .T., Replicate('0', 24-Len(Alltrim(dp))) + Alltrim(dp), Alltrim(dp)+Replicate('0', 24-Len(Alltrim(dp)))))+'~'+ ;
Replicate('0', 12-Len(Alltrim(page_num))) + Alltrim(page_num)+'~'+ ;
Replicate('0', 4-Len(Alltrim(subpage_num))) + Alltrim(subpage_num)

Note that the prefix and subpage_num fields in this script was left out of our discussion for brevity.


> Looks like your combination of CountyId, page_type, book_num, and page_num is really your primary key. More specifically, the combination of these four columns uniquely identifies a row.

YES, I actually created a column that represented all these fields and changed the pri key to this filed, but that didn't help as new inserts wasn't being ordered correctly without using an explicit "order by" clause, so I abandoned that way.

> It also appears (to me) as though the proper ordering for all the documents is CountyId, page_type, book_num and page_num. If you did a simple select, like this....

YES

> I assume that the proper ordering of books would be: 124, 125, 125-A, 125-B, 126. Is this correct?

YES

I do know that the book numbers are mostty numeric, but has some character data in them as well. There are book numbers like "125", "125-B, "DEED INDEX" and so on. Therefore I look at the 1st character and if numeric I pad zeros to the left, otherwise I pad to the right.

I still have to verify the page and subpage fields to see if they all contain legal numbers after conversion, and if so they will be changed to int.


> Don't actually run this
> Select CountyId, page_type, book_num, page_num
> From Page
> Order By CountyId, page_type, book_num, page_num

This is used for querying user specified rows...

Don't confuse this as we've been talking about 2 entirely different topics in this thread. The first one is centered around the 4 buttons that returns specific matching user specified criteria. The second one is centered around the "PrevPage" and "NextPage" buttons that has nothing to do with the returned results the user queryed for, instead they get the pages immediately before or after the presently viewed page and must be able to traverse the whole library.

Thanks, Stanley
 
Here's what I'm thinking....

If you create a column named lib_index, the possibility exists for this number to be problematic. For example, if you always add 1 or subtract 1 to get the next page, then someone viewing the data would not see newly inserted data until the next day. I suppose it's a business decision for "is this good enough". If you use enough digits and are renumbering the data nightly, the concerns about running out of numbers is somewhat mitigated.

If this were my responsibility, I would attempt to get a solution that didn't require nightly maintenance while also allowing people to view newly inserted data as soon as it is inserted.

Currently, I'm thinking that you could create a computed column that makes the sorting and identifying next/previous pages easier. Specifically, I would create a computed column similar to the VFP code you show above. Computed columns are nice because you don't need to maintain them. You can almost think of them as a formula instead of data.

After creating the computed columns, I would then create a non-clustered index that includes all of these columns. A couple things happen when you create an index on a computed columns. The computed data is actually stored in the index and the index can be used to look up data very quickly.

Before creating the computed column, I would create a function that performs the VFP code you show above. Of course, this is not exactly the same code, but it should create similar functionality.

Code:
Create Function dbo.CalculateLibSort(
	@County_Id varchar(5),
	@page_type varchar(40),
	@book_num varchar(24),
	@page_num varchar(20)
	)
Returns varchar(100)
With SchemaBinding
As
Begin
	Return(

		Convert(Char(5), @County_id)
		+ '~'
		+ Convert(Char(40), @page_type)
		+ '~'
		+ Replicate('0', 20 - Len(Convert(VarChar(20), Convert(Int, Left(@book_num, PatIndex('%[^0-9]%', @book_num + 'X')-1)))))
		+ Convert(VarChar(20), Convert(Int, Left(@book_num, PatIndex('%[^0-9]%', @book_num + 'X')-1)))
		+ '~'
		+ Convert(Char(5), Right(@book_num, len(@book_num) - PatIndex('%[^0-9]%', @book_num + 'X')+1))
		+ '~'
		+ Replicate('0', 20-Len(@page_num)) + @page_num
	)
End

Then, create the computed column like this:

Code:
Alter Table Page Add LibSort As (dbo.CalculateLibSort(county_id, page_type, book_num, page_num))

Now you can use it to find the previous page like this:
Code:
Select	Top 1 *
From	Page
Where	LibSort < dbo.CalculateLibSort(@CountyId, @page_type, @book_num, @page_num)
Order BY LibSort

Or the next page like this:
Code:
Select	Top 1 *
From	Page
Where	LibSort > dbo.CalculateLibSort(@CountyId, @page_type, @book_num, @page_num)
Order BY LibSort


-George
Microsoft SQL Server MVP
My Blogs
SQLCop
twitter
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom
 
George,

> someone viewing the data would not see newly inserted data until the next day

YES, you are right if we use the number plus or minus one. So all I need to do is look for the next or previous decimal number and that would always be instantly available for viewing. A main requirement is instant viewability. So, thanks for pointing that out as I had not thought it all the way thru when I said "add or subtract 1" for the next/prev page. Just a small tweak should solve this.


> get a solution that didn't require nightly maintenance

I am open to any suggestions for doing this and so far no suggestions has surfaced.


> also allowing people to view newly inserted data as soon as it is inserted.

YES, this is a must and answered above...


> Now you can use it to find the previous page like this:
> Where LibSort < dbo.CalculateLibSort(@CountyId, @page_type, @book_num, @page_num)

I don't this will traverse the whole table using next or prev commands because they are referencing a page_type and/or book_number and/or page number. Correct me if I'm wrong... The way I hammered out doesn't care those fields in the query as it simple looks at the decimal numbers...


Currently the data is coming in from a VFP app therefore the lib_sort field is receiving the data from the vfp snippet above. I will do as you suggest and create the calculated column so it will be complete free standing and not reliant on VFP.

 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top