Need help sorting and extracting from a file - Peculiar issue

mlapse · Aug 9, 2005

Hi All,

I face a peculiar issue with the data from one of our partners.

They are sending the data in the following format (each line is one record):

A12345
A1234586
A1
B2843
A2395
A98
A12392
B986
B97352
...
...

The first character is an identifier. Now my requirement is to extract the data for all identifiers which occurs last in the file.

In the above example I should be extracting only A98 and B97352.

I am totally lost as to how to go about this.

Could someone provide some pointers.

Thanks

mlapse · Aug 9, 2005

Please note there is no sorting of the file. The only thing we know is that the latest data appears later in the file.

JuJutsu · Aug 10, 2005

There are probably a number of ways to do this. You can extract all the A's and B's and use the COUNT function to figure out how many objects and use the CHOOSE function to get the one you want.

JuJutsu - Jeff S.
Support Analyst

mlapse · Aug 10, 2005

How would I be able to use the CHOOSE command to extract only the last occurrence of an A.

I am able to extract all the A's and Count them. But how can I use the CHOOSE command. I have used the following function. However the output is coming blank when I run the maps.

=CHOOSE(EXTRACT(ElemDecl Orderbook Element:In2, ORDER~NUMBER~KEY:ElemDecl ORDER~HEADER:In1 = ORDER~NUMBER~KEY:ElemDecl ORDER~HEADER:ElemDecl Orderbook Element:In2),

COUNT(EXTRACT(ElemDecl Orderbook Element:In2, ORDER~NUMBER~KEY:ElemDecl ORDER~HEADER:In1 = ORDER~NUMBER~KEY:ElemDecl ORDER~HEADER:ElemDecl Orderbook Element:In2)))

The choosing is done from the extract of the series containng all the matching data. The count is used to determine the number of occurrences which are passed as the position in to the CHOOSE command.

What is wrong with this?

eyetry · Aug 10, 2005

confused... data looks like....

A12345
A1234586
A1
B2843
A2395
A98
A12392
B986
B97352
...
...

requirement... 'extract the data for all identifiers which occurs last in the file'

example... 'I should be extracting only A98 and B97352'

isn't 'A12392' the last 'A' identifier? If you really need to grab 'A98'. I'm not sure how you'll do it.

I've found that many of these problems are tree based. How is your tree defined?

File:
Record-Group:
A-Record (0:S) initiator 'A'
number-element
B-Record (0:S) initiator 'B'
number-element

mlapse · Aug 10, 2005

sorry about that . you are correct . I need to grab A12392 and B97352 from the example.

The type tree is defined ok to denote S number of occurrences.

my problem is that I am not sure how to extract the last occurrence on the A type record and the last occurrence of the B type record.

basically I need to match a certain key (A or B) for each record with the remaining records of the file. The last record in the input for that particular key is what I need to take and write to the output.

Thanks!

Samji · Aug 11, 2005

Hi,

Can you just try writing the same rule in different levels.
1st level: Extract the objects and pass it to a function.
You can call two functions for A & B.
2nd level: Here you use Choose(series,count(series))

Hope it works.

JEMIAZ · Aug 11, 2005

Hi
Many solutions - this would work quickly if input file is not too large

Sort the input file by the first column on output card 1
Pass the sorted list into second map with control break logic on input tree

Input
Change_of_Field1_Group (s)
Record (s)
Field1
Field2
Use [Last] for each Change_of_Field1_Group to get the values you want

Tim

mlapse · Aug 11, 2005

Hi Tim,
Not sure what you mean by "Control break logic on input tree"

1) What is this feature and how do I do this?
2) How do I sort a file based on some column value. I thought the sorting in Mercator is done based on the whole record.

Samji,

I tried creating two separate functions but it doesnt work. I get errors like not enough memery to execute map. This seems strange coz the input contains only 5 records. I think the map goes into some sort of loop.

Samji · Aug 11, 2005

Hi,

I considered the input structure to be
A,1
A,345
A,4
.
.
.

Used the same structure for output also.
now at record level write a functional map as

=f_map(count(extract(Record:input,field1:Record:input = "A")),input)

you here the output for this funcmap will be 3,the input.

Inside the functional map
At the element level write the functio as following:

=f_lastelement(choose(Record:Record,texttonumber(count)))
So here the 3rd record A,4 will be passed.Then map the number part.

Though the tree stucture considered is different from yours.
I think that can be handled in the rule level
Since your data is like A458 and A is always 1 byte and starting position. Use mid function while Extracting the records.

=f_map(count(extract(Record:input,MID(field1:Record:input,0,1)) = "A")),input)

Hope this works.

JEMIAZ · Aug 15, 2005

Control Break logic is covered in detail in the type tree pdf but essentially allows you to specify an end to a block of data based on data values rather than structure. For example in a file of purchase order lines you could define an extra group that contained each group of purchase order numbers. Its not easy to explain! pls see book, its quite simple and powerful. It does require that the data is ordered though.

For sorting - yes data is sorted by record and I get around this be prefixing each record with the column I'm interested in before performing the sort.

Tim

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Need help sorting and extracting from a file - Peculiar issue

mlapse

IS-IT--Management

mlapse

IS-IT--Management

JuJutsu

Vendor

mlapse

IS-IT--Management

eyetry

Programmer

mlapse

IS-IT--Management

Samji

Programmer

JEMIAZ

Technical User

mlapse

IS-IT--Management

Samji

Programmer

JEMIAZ

Technical User

Similar threads

Part and Inventory Search

Sponsor