Searching a TStringList of Name Value Pairs

Status
Not open for further replies.

MichaelHooker

Programmer
Mar 17, 2006
70
GB
This one has me baffled. I use Delphi 7 Personal Edition.

The Delphi Help says of IndexOfName:

"Returns the position of the first name-value pair with the specified name.
...
Call IndexOfName to locate the first occurrence of a name-value pair where the name part is equal to the Name parameter or differs only in case. IndexOfName returns the 0-based index of the string. If no string in the list has the indicated name, IndexOfName returns -1.

Note: If there is more than one name-value pair with a name portion matching the Name parameter, IndexOfName returns the position of the first such string."

Note the use of the word "first". The .Values function is supposed to work the same way.

I have written a routine using IndexOfName to pick items out of a name-value list. As an item is matched up the relevant name-value pair is removed from the list, so as to leave a list of un-matched items. It works perfectly when the names are unique but when they are not it does not behave as the Help suggests (which is what I need). IndexOfName is returning the position of the <second> matching name, not the first, and .Values also returns the value of the second instance of the name.
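For reference, the documented behaviour can be checked in isolation with a small console program. This is only a minimal sketch, assuming a Delphi 7-style compiler and the standard Classes unit; the program name is made up and the three rows are copied from the sample data in this thread. It repeatedly looks up 'QFA2', prints the pair that was found, and deletes it, so duplicates should come out in list order (the 0910 row before the 2214 row).

```pascal
program FirstMatchDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

var
  Pairs: TStringList;
  Position: Integer;
begin
  Pairs := TStringList.Create;
  try
    Pairs.Add('QFA10=QFA10,2135,B744,WSSS,27L,BPK6G');
    Pairs.Add('QFA2=QFA2,0910,B744,VTBS,27R,BPK6F');
    Pairs.Add('QFA2=QFA2,2214,B744,VTBS,27L,BPK6G');

    // Consume the matches one at a time, deleting each after use.
    Position := Pairs.IndexOfName('QFA2');
    while Position > -1 do
    begin
      // ValueFromIndex reads the value of the pair that was actually found.
      WriteLn(Position, ': ', Pairs.ValueFromIndex[Position]);
      Pairs.Delete(Position);
      Position := Pairs.IndexOfName('QFA2');
    end;
  finally
    Pairs.Free;
  end;
end.
```

Reading the value with ValueFromIndex[Position] rather than Values[Target] avoids a second search of the list and guarantees that the value printed and the line deleted are the same entry.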

This is the relevant part of the routine, which is called from a for loop cycling through a list of search targets:

Code:
{Target is the string to be matched with a name;
 TempList is the StringList of Name-Value pairs to be searched;
 Position is the position in TempList returned by IndexOfName;
 OutLine is the value part of a match;
 ProcessOutLine is a separate procedure which processes Outline.}
  
  Position := TempList.IndexOfName(Target);
  if Position > - 1 then begin
    OutLine.CommaText := TempList.Values[Target];
    //OutLine.CommaText := TempList.ValueFromIndex[Position];  alternative to line above
    ProcessOutLine;
    TempList.Delete(Position);
  end
  {else... do something else if Position = -1, ie Target wasn't found}

These are a few lines of TempList:

PIA786=PIA786,1930,B773,OPRN,27L,DVR4G
PIA788=PIA788,1650,B772,OPKC,27L,DVR4G
QFA10=QFA10,2135,B744,WSSS,27L,BPK6G
QFA2=QFA2,0910,B744,VTBS,27R,BPK6F
QFA2=QFA2,2214,B744,VTBS,27L,BPK6G *
QFA30=QFA30,1220,B744,VHHH,27R,BPK6F
QFA32=QFA32,1219,B744,WSSS,27R,DVR5F
QTR002=QTR002,2113,A346,OTBD,27L,DVR4G
QTR008=QTR008,2152,A333,OTBD,27L,DVR4G

The first search for 'QFA2' as Target returns the second instance, asterisked above. That line is then deleted and the second search for 'QFA2' returns the remaining instance, which happens to be the first one. So they come out the wrong way round. This happens whether or not TempList has been sorted; the natural order and the sorted order both happen to place the target Names in the "right" order. And it all matters because Target is just one field in another CommaText StringList, and the object of the exercise is to combine the two sets of data fields relating to Target.

I guess I can get around this by checking whether or not the name previous to the found name is the same and if so fetching the value of that previous entry in the list, or by sorting the list backwards (by sending the lines in reverse order into a temporary StringList and swapping them back to TempList again). But (a) it's going to slow things down (b) I don't yet know what's going to happen if there are more than two names the same and (c) it shouldn't be necessary, should it?

So my question is: have I completely misunderstood the Help, am I doing something stupid or is it just that Delphi doesn't work as advertised?

Thanks.

Michael Hooker
 
I use Delphi 7 Pro but I would imagine that the VCL TStringList code would be identical in Personal Edition.

I dropped a TButton and a couple of TEdits on a form, added events to create and destroy the TStringList and a button click event. The code worked as the Help would indicate (i.e. it displayed the first occurrence).

Although you supplied a lot of information in your question you did not specify exactly how the stringlist was set up, what happens in your ProcessOutline procedure or how you detected that you got the second occurrence.

Can you try the following code which works for me in Delphi 7 Pro and let us know if it works in Delphi 7 Personal Edition?
Code:
procedure TForm1.FormCreate(Sender: TObject);
begin
  TempList := TStringList.Create;
  TempList.Add( 'PIA786=PIA786,1930,B773,OPRN,27L,DVR4G' );
  TempList.Add( 'PIA788=PIA788,1650,B772,OPKC,27L,DVR4G' );
  TempList.Add( 'QFA10=QFA10,2135,B744,WSSS,27L,BPK6G' );
  TempList.Add( 'QFA2=QFA2,0910,B744,VTBS,27R,BPK6F' );
  TempList.Add( 'QFA2=QFA2,2214,B744,VTBS,27L,BPK6G' );
  TempList.Add( 'QFA30=QFA30,1220,B744,VHHH,27R,BPK6F' );
  TempList.Add( 'QFA32=QFA32,1219,B744,WSSS,27R,DVR5F' );
  TempList.Add( 'QTR002=QTR002,2113,A346,OTBD,27L,DVR4G' );
  TempList.Add( 'QTR008=QTR008,2152,A333,OTBD,27L,DVR4G' );
end;

procedure TForm1.FormDestroy(Sender: TObject);
begin
  TempList.Free;
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  position: integer;
  target: string;
begin
  target := edit1.Text;
  Position := TempList.IndexOfName(Target);
  if Position > - 1 then begin
    Edit2.Text := TempList.Values[Target];
//    Edit2.Text := TempList.ValueFromIndex[Position];
    TempList.Delete(Position);
  end;
end;

Andrew
Hampshire, UK
 
Thanks.

>>Although you supplied a lot of information in your question you did not specify exactly how the stringlist was set up<<
TempList is declared as a TStringList in the top level form declarations, created in FormCreate, and destroyed in FormDestroy. The data is imported into it using TempList.LoadFromFile('example.csv') and each line is turned into a name-value pair by taking the first item of data, adding an "=" and then adding this to the beginning of the line. TempList is not set as Sorted, but is sorted after this process. It makes no difference to the outcome if the TempList.Sort instruction is commented out.
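The conversion described above can be sketched roughly as follows. This is an illustration only, not the code from the real program: PrefixWithName is a hypothetical helper, the two Add calls stand in for LoadFromFile, and it assumes the first field never contains a quoted comma.

```pascal
program BuildPairsDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: prefix every line with "<first field>=".
// The list must not have Sorted = True, because lines are rewritten in place.
procedure PrefixWithName(List: TStrings);
var
  i, CommaPos: Integer;
  Key: string;
begin
  for i := 0 to List.Count - 1 do
  begin
    CommaPos := Pos(',', List[i]);
    if CommaPos > 0 then
      Key := Copy(List[i], 1, CommaPos - 1)
    else
      Key := List[i];  // single-field line: the whole line is the key
    List[i] := Key + '=' + List[i];
  end;
end;

var
  TempList: TStringList;
begin
  TempList := TStringList.Create;
  try
    // Stand-in for TempList.LoadFromFile('example.csv')
    TempList.Add('QFA2,0910,B744,VTBS,27R,BPK6F');
    TempList.Add('QFA2,2214,B744,VTBS,27L,BPK6G');
    PrefixWithName(TempList);
    WriteLn(TempList.Text);
  finally
    TempList.Free;
  end;
end.
```

Because each line is rewritten where it stands, the relative order of duplicate names is preserved by this step.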
>>what happens in your ProcessOutline procedure<<
Nothing relevant. The object of the exercise, as I wrote, is to combine the two sets of data fields relating to Target. That's what it does. Target is an identifying field in a comma-delimited entry in another stringlist (call it List1). ProcessOutLine adds a new comma-delimited entry to a third StringList. This entry consists of some fields from the original Target line in List1 and some fields from the matched-up line in TempList as reflected in OutLine. It does nothing to TempList, it works on OutLine, which is read in the code I provided:
Code:
OutLine.CommaText := TempList.Values[Target];

>>how you detected that you got the second occurrence.<<
Apart from the fact that the person who uses the program keeps complaining about it? Quite simply by the fact that the first and second occurrences contain recognisably different data. The output of ProcessOutLine for the first instance of Target in List1 is a combination of the data related to that first instance and the second instance of Target in TempList. It works perfectly when Target is unique in both.

>>Can you try the following code which works for me in Delphi 7 Pro and let us know if it works in Delphi 7 Personal Edition?<<

Yes, I shall, and I'll post the result. Unless the answer is apparent from what I've just written, I suggest you don't reply until I report back so we don't get confused. Thanks again.

Michael Hooker
 
Andy

Your code worked (as I expected). I then wrote myself a test program and that worked too. I then complicated that test program by including a procedure based on my original ProcessOutline, and it still worked (thus confirming that it's not this procedure which is messing things up). I then included a stage I hadn't mentioned previously (the string Target is in fact a value looked up in yet another name-value list!). And it still worked.

Now all I have to do is fathom out why the code in my real program produces a different result. Obviously <something> is screwy but I can't yet see what is different from the working code in the test programs. I shall have to try creating some test input files and debugging them step by step (difficult with the real data since the double-entries only happen once every few thousand times round the cycle...).

Thanks for your help so far - proving that my code can work with this construction is a positive step.

This, for what it's worth, is my test program - a button and 4 memo boxes (make them big enough or include scrollbars) are needed, I hope the rest is self-explanatory.

Code:
unit Unit1;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, StdCtrls;

type
  TForm1 = class(TForm)
    Button1: TButton;
    Memo1: TMemo;
    Memo2: TMemo;
    Memo3: TMemo;
    Memo4: TMemo;
    procedure FormCreate(Sender: TObject);
    procedure FormDestroy(Sender: TObject);
    procedure Button1Click(Sender: TObject);
    procedure ProcessLine2();
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;
  List1, List2, List3, Line1, Line2: TStringList;
implementation

{$R *.dfm}

procedure TForm1.FormCreate(Sender: TObject);
begin
  List1 := TStringList.Create;
  List2 := TStringList.Create;
  List3 := TStringList.Create;
  List1.Add('ban,yellow,Sainsburys');
  List2.Add('banana=banana,yellow,75p');
  List3.Add('ban=banana');
  List1.Add('app,red,Waitrose');
  List2.Add('apple=apple,red,40p');
  List3.Add('app=apple');
  List1.Add('pep,green,Sainsburys');
  List2.Add('pepper=pepper,green,75p');
  List3.Add('pep=pepper');
  List1.Add('pep,yellow,Tesco');
  List2.Add('pepper=pepper,yellow,60p');
  List1.Add('cab,green,Lidl');
  List2.Add('cabbage=cabbage,green,99p');
  List3.Add('cab=cabbage');
  Memo1.Lines.Clear;
  Memo2.Lines.Clear;
  Memo3.Lines.Clear;
  Memo4.Lines.Clear;
  Memo1.Lines.Add('Contents of List1:');
  Memo1.Lines.AddStrings(List1);
  Memo2.Lines.Add('Contents of List2:');
  Memo2.Lines.AddStrings(List2);
  Memo3.Lines.Add('Contents of List3:');
  Memo3.Lines.AddStrings(List3);
  Memo4.Lines.Add('Output:');
  Line1 := TStringList.Create;
  Line2 := TStringList.Create;
end;

procedure TForm1.FormDestroy(Sender: TObject);
begin
  List1.Free;
  List2.Free;
  List3.Free;
  Line1.Free;
  Line2.Free;
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  i, Position: integer;
  Target: string;
begin
  for i := 0 to List1.Count - 1 do begin
    Line1.CommaText := List1[i];
    Target := List3.Values[Line1[0]];
    {extra code needed if nothing found in List3}
    Position := List2.IndexOfName(Target);
    If Position > -1 then begin
      Line2.CommaText := List2.Values[Target];
      ProcessLine2;
      List2.Delete(Position);
    end;
  end;
end;

Procedure TForm1.ProcessLine2();
var
  OutLine: string;
begin
  OutLine := Line2[0] + ',' +
             Line1[1] + ',' +
             Line2[1] + ',' +
             Line1[2] + ',' +
             Line2[2];
  Memo4.Lines.Add(OutLine);
end;
end.

Michael Hooker
 
Tracked down the problem after setting up some dummy input files of manageable size (20 records) and code to save various StringLists to files as they were amended. Printing out these files and comparing them immediately showed what was happening.

The StringList from which the search string which eventually became Target was taken was being sorted (unnecessarily, it seems) at some stage between being imported and my routine, so in fact what should have been the second instance of the same string was being presented to my routine first. And my routine was behaving as it should have done and matching it up to the first instance in the StringList being searched.
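The debugging approach described here, saving each StringList to a file whenever it is amended, might look something like the sketch below. DumpList, the dump_NN_stage.txt file names and the sample rows are all invented for illustration; only SaveToFile, Sort and Format are standard calls.

```pascal
program DumpListsDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

var
  DumpCount: Integer = 0;

// Hypothetical helper: write a numbered snapshot of a list to disk,
// e.g. dump_01_loaded.txt, dump_02_sorted.txt, in the current directory.
procedure DumpList(List: TStrings; const Stage: string);
begin
  Inc(DumpCount);
  List.SaveToFile(Format('dump_%.2d_%s.txt', [DumpCount, Stage]));
end;

var
  Source: TStringList;
begin
  Source := TStringList.Create;
  try
    // Made-up rows standing in for the imported data.
    Source.Add('QF002Y,delayed');
    Source.Add('QF002,normal');
    DumpList(Source, 'loaded');

    Source.Sort;
    DumpList(Source, 'sorted');  // compare this file with the previous one
  finally
    Source.Free;
  end;
end.
```

Numbering the snapshots keeps them in processing order, so comparing consecutive files shows the stage at which the order of the entries changed.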

Thanks again for the help. It's always good to rule out the possibility of an "undocumented feature" in Delphi or an oversight in the Help before spending hours and hours trying to make something work that never will.

Michael Hooker

 
Thanks for posting the solution (it's really annoying when some posters don't do that). I'm glad you've found the cause of the error and that it wasn't a VCL bug.

When confronted with a similar kind of problem it is usually a good idea to code the simplest possible program that will reproduce the error.

First, and probably most important, coding the simplified program will often lead you to find the real cause of the problem.

Second, you can show the source code on Tek-Tips so that others can actually reproduce the effect.



Andrew
Hampshire, UK
 
You're welcome. Before I retired I ran a sort of help-desk myself, in a completely different field, and I now seem to spend half my retired life answering questions in another completely different field, so I know how frustrating it is when nobody tells you what the outcome was.

A problem I've encountered before in fora like this is the person who doesn't address the question but makes (sometimes even helpful) suggestions how to achieve the goal another way. Fine, but the question was why my code doesn't work, and generally I chose my way of doing it because others were impractical in the light of other issues it would take a month of Sundays to explain. Then you end up having to explain why you can't use/don't like their solution and they lose interest in what the problem was in the first place. My last question here came into that category, and I still don't know how to skip back to the top of a loop full of if..else if.. statements for the next iteration through the loop when one of the conditions is met. Sadly the Delphi version of "case" is about as useful as a concrete banana in the context of my program. I was being told that my coding resulted in all the tests being run through (which I knew, because it also resulted in big problems and hence my question) but not why this was happening (and I still don't know what I was doing wrong). I shall come back to this one again in a separate thread using the approach you suggest, because I really would like to know the answer. Though it will take a <lot> of time to think up some comprehensible example data and tests which show what is happening because the real world data and tests are mind-bogglingly obscure and nobody would be able to tell that the output is wrong.

Michael Hooker
 
The StringList from which the search string which eventually became Target was taken was being sorted (unnecessarily, it seems) at some stage between being imported and my routine, so in fact what should have been the second instance of the same string was being presented to my routine first.
I'm not so sure that this is what's happening, especially because there is a better answer. When you use a for loop, if the optimizer can figure out the limits and increment, it reverses the order through the loop (i.e., for i := 1 to 10 becomes for i := 10 to 1 step -1) to produce more efficient code. You can observe this using the debugger. I haven't compiled your code, but you might want to try this.
 
Thanks, harebrain.

I'm not so sure that this is what's happening, especially because there is a better answer.

Would this explain why commenting out the line that unnecessarily sorted the source StringList solved the problem?

Michael Hooker
 
It very well could be. I can't really tell without running your program in debug mode, and I don't have a Delphi-equipped machine at hand. :-(
 
I meant to add (and forgot in both previous posts) that there is an option to turn off this behavior. You'll have to look through the {x} flags for it, as I don't have any documentation handy, either.

This would be another way to test the hypothesis that it is the for-loop optimization that was responsible for the odd behavior.
 
Um. In the "sorted" order the items were coming out in the wrong order because alphabetically they were in the wrong order. In the unsorted, natural, order the items came out in the right order because they were in the right order. As I explained, I saved the actual stringlists to file and printed them out. Normally it wouldn't make any difference, but at random times every few thousand entries there are two items with the same key - something I didn't know would happen when I started writing my program, and so didn't know would be a problem if the lists were sorted.

I hate to waste time on explaining the detail unnecessarily, but this relates to aircraft arriving at and departing from Heathrow - there are two different systems for recording the details, and they use different codes, with some details in one system and some in the other. My program matches up entries in one system with the relevant entries in the other and produces a list in the format used by the second system containing all the details. Now, when a daily flight arrives too late to leave again before the noise curfew it has to wait until the following morning - so on the second day we will get two flights with the same flight number departing. System one differentiates the flights by adding a "Y" to the end of the delayed one (if they remember) - so QANTAS flight 2 to Sydney would normally be QF002 but if it departs a day late the late flight becomes QF002Y and in theory at least it goes before the normal QF002. (System 2 doesn't differentiate at all and calls them both "QFA2" which isn't really playing fair.) Now if both lists are sorted in time order - which is the natural order they come in - then it doesn't matter how many flights use the same number, they'll always be in the same order in both lists. Furthermore QF002Y and QF002 are equal as far as my matching routine is concerned because they both have to match to a QFA2 - the "Y" is dropped for the purposes of the comparison. Still with me?

Consider what happens when list 1 is sorted alphabetically. QF002Y then comes after QF002, not before it as it should. But if list 2 is sorted alphabetically it so happens that the two instances of QFA2 still come out in time order, because the time is the second field: "QFA2,09:15" is always going to come before "QFA2,21:15".
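The effect of an alphabetical sort on list 1 can be reproduced with two made-up rows. This is a sketch only: the real System one record layout is not shown in this thread, so the fields after the flight code are invented, and it assumes the default TStringList.Sort comparison, which should place a comma ahead of a letter.

```pascal
program SortOrderDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

var
  List1: TStringList;
  i: Integer;
begin
  List1 := TStringList.Create;
  try
    // Made-up rows in time order: the delayed flight departs first.
    List1.Add('QF002Y,delayed flight,departs first');
    List1.Add('QF002,normal flight,departs second');

    List1.Sort;  // alphabetical sort, as in the unnecessary Sort call

    // After sorting, the QF002 row is expected to be listed ahead of
    // QF002Y, which reverses the time order of the two departures.
    for i := 0 to List1.Count - 1 do
      WriteLn(List1[i]);
  finally
    List1.Free;
  end;
end.
```

With the "Y" dropped for matching, both rows look up 'QFA2', so whichever row is processed first takes the first 'QFA2' entry from the other list. Leaving list 1 in its natural time order keeps the two lists in step.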

So, with all due respect, I think optimisation wasn't a factor here. The loop was behaving as expected. And in fact I did follow it through in the debugger using the limited test data and the loop was iterating in the right order.

One of the reasons we all use for loops is because there is a nice integer which increments or decrements as instructed (in theory), which is just the ticket for many jobs. If it can't be relied on I guess we can resort to a while loop, setting a counter to 0 to start with and incrementing it at the right point until it reaches the number of items in the list -1. That counter is then used to access the relevant string in the list. Or does optimisation mess that up too? I'd rather not mess about with {x} stuff...
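The two loop forms mentioned above can be compared side by side. This is a minimal sketch with invented item names, assuming a Delphi 7-style compiler. As far as I know, when the loop variable is used in the body of a for ... to loop, the values it takes are ascending regardless of how the compiler counts internally, so both loops should visit the items in the same order.

```pascal
program LoopOrderDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

var
  Items: TStringList;
  i: Integer;
begin
  Items := TStringList.Create;
  try
    Items.CommaText := 'first,second,third';

    // for loop: i is used in the body, so it takes the values 0, 1, 2.
    for i := 0 to Items.Count - 1 do
      WriteLn('for   ', i, ' ', Items[i]);

    // Equivalent while loop with an explicit counter.
    i := 0;
    while i < Items.Count do
    begin
      WriteLn('while ', i, ' ', Items[i]);
      Inc(i);
    end;
  finally
    Items.Free;
  end;
end.
```

The while form is mainly useful when the list shrinks or grows inside the loop, because Items.Count is re-evaluated on every pass, whereas a for loop evaluates its limit once before it starts.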

Michael Hooker
 
The compiler tries to be smart about applying this optimization, and generally it is. On rare occasions, there are unintended consequences, so we need to be aware of what is happening "under the hood." If this optimization has an adverse effect on our code, we have the option to turn it off. Apparently such was not the case here.

The for-loop compiler optimization exists because counting down to zero lets the loop test use the processor's zero flag instead of a separate comparison against the upper limit. What, in general terms, does a compiler do? It uses syntax rules to convert a higher-level language to a lower-level language. It doesn't "mess" with the logic expressed in your code.
 