Shuffle lists

Status
Not open for further replies.

Macdadi112

Programmer
May 25, 2016
11
GB
Hi,

I was wondering if you could help, as I'm sure it's possible. I remember something like this being possible with sed/awk a while back.

I have a list in a file such as sample_list.txt which contains the following:


Code:
example1, answer1
example2, answer2
example3, answer3
example4, answer4
example5, answer5
example6, answer6
example7, answer7
example8, answer8
example9, answer9
example10, answer10
e.t.c

What I want to do is create a script that, when it runs:

1) lists either the first or second field and places it on the left-hand side
2) places its corresponding example/answer on the opposite side
3) repeats this for 10 of the words and then shuffles the words

example:


Code:
example8 answer1
example10 answer6
example1 answer8
answer3 example5
example7 answer9
example9 answer7
answer2 answer9
example6 example3
example4 example2
answer5 answer4

Hope this helps.

Thanks
 
Hi

Honestly, I do not get the logic in your 3-point to-do list. I would do it like this:
Code:
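# Store both fields of every input line, indexed by line number.
{
    data[NR, 1] = $1
    data[NR, 2] = $2
}

# Print 10 lines, each pairing a random value from column 1
# with a random value from column 2.
END {
    srand()
    for (i = 0; i < 10; i++)
        print random(1), random(2)
}

# Return a random, not yet consumed value from the given column.
# Keeps retrying until it finds a non-empty value.
function random(column)
{
    do {
        number = int(rand() * NR + 1)
    } while (! (result = data[number, column]))
    data[number, column] = ""
    return result
}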


Feherke.
feherke.ga
 
I am trying to do it at home, but for some reason I can't get it to run.

I am using Cygwin. How do I invoke the above script?
 
Hi

Oops, sorry. I originally wrote the code on the command line, so I used single-letter variables, which made it hard to read. While posting I renamed the variables, except for 2 occurrences of d. I have edited my previous post.

Regarding the invocation, put the Awk code in a file, for example shuffle_list.awk, then run it like this:
Code:
awk -f /path/to/shuffle_list.awk /path/to/sample_list.txt

Feherke.
feherke.ga
 
Macdadi112 said:
I tried the above and it seems to be hanging...
I tried feherke's example and it works excellently. You must have a technical problem...
 
Hi

Macdadi112 said:
I tried the above and it seems to be hanging...
The code as I posted it tries to extract 10 random values from each column. Are you sure you feed the script at least 10 values (*)?

(*) Values must be non-empty (not 0, 0.0 or ""), as consumed values are set to "" to avoid returning duplicates. This can be changed if needed.
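
One way the duplicate handling could be changed, as a rough sketch rather than feherke's own revision: remember the consumed positions in a separate array instead of blanking the values, so that 0 and empty strings stay usable, and cap the request at the number of input lines so the loop always ends. The names shuffle_pairs, pick, used and want are made up for this illustration.

```shell
# Hypothetical wrapper: reads lines from standard input and prints up to
# "$1" (default 10) lines of randomly paired first and second fields.
shuffle_pairs() {
    awk -v want="${1:-10}" '
    {
        data[NR, 1] = $1
        data[NR, 2] = $2
    }

    END {
        srand()
        want += 0
        if (want > NR)              # never ask for more values than exist
            want = NR
        for (i = 0; i < want; i++) {
            left = pick(1)
            right = pick(2)
            print left, right
        }
    }

    # Return a value from the given column whose row was not picked before.
    # Consumed rows are recorded in used[], so the value itself may be 0 or "".
    function pick(column,    number)
    {
        do {
            number = int(rand() * NR) + 1
        } while ((number, column) in used)
        used[number, column] = 1
        return data[number, column]
    }'
}

printf 'example1 answer1\nexample2 answer2\nexample3 answer3\n' | shuffle_pairs 10
```

With only 3 input lines the call above prints 3 lines instead of hanging, because the request of 10 is cut down to what is available.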

Feherke.
feherke.ga
 
I get the same thing :-(

Okay, what I've got is the code in a script, and it looks like this:

Code:
$ cat tektips.awk
{
    data[NR, 1] = $1
    data[NR, 2] = $2
}

END {
    srand()
    for (i = 0; i < 10; i++)
        print random(1), random(2)
}

function random(column)
{
    do {
        number = int(rand() * NR + 1)
    } while (! (result = d[number, column]))
    d[number, column] = ""
    return result
}

Then I have a sample file with the following content:

Code:
$ cat sample.txt
example1, answer1
example2, answer2
example3, answer3
example4, answer4
example5, answer5
example6, answer6
example7, answer7
example8, answer8
example9, answer9
example10, answer10

And then I run the following command:

Code:
$ awk -f tektips.awk sample.txt

It just sits there...
What am I doing wrong, and where?
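
A likely cause of the hang, judging from the listing above: the main block fills data, but random() still reads d, one of the leftover single-letter names feherke mentioned. In Awk an array that was never filled returns an empty string for every lookup, so the while condition never becomes false. A quick check, using a made-up helper name check_arrays:

```shell
# Hypothetical helper: shows that d and data are two unrelated arrays.
check_arrays() {
    awk 'BEGIN {
        data[1, 1] = "example1"        # what the main block stores
        if (d[1, 1] == "")             # d was never filled, so this is empty
            print "d[1, 1] is empty"
        print "data[1, 1] is " data[1, 1]
    }'
}

check_arrays
```

If that is indeed the problem, changing both occurrences of d to data in random() should let the loop end.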
 
Just a little afterthought.

How would I remove the comma between the words?
 
Hi

Macdadi112 said:
How would I remove the , between the words.
By default Awk splits the record into fields on sequences of one or more whitespace characters, so those commas in the output are part of the 1st field. To get rid of them, simply use a field separator that consumes those commas (*):
Code:
awk -F ', *' -f tektips.awk sample.txt
Or set it in the script file so you do not have to specify it on the command line each time you run it:
Code:
BEGIN {
    FS = ", *"
}

# ... and the old script follows here...
(*) The expression I used says to split on a comma followed by zero or more spaces.
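
To see the difference on a single record, here is a small check that can be pasted into a shell. The variable names line, default_split and comma_split are only for this illustration, and the "|" is printed just to make the field boundary visible.

```shell
line='example1, answer1'

# Default splitting on whitespace: the comma stays attached to field 1.
default_split=$(printf '%s\n' "$line" | awk '{ print $1 "|" $2 }')

# Splitting on a comma followed by zero or more spaces: the comma is consumed.
comma_split=$(printf '%s\n' "$line" | awk -F ', *' '{ print $1 "|" $2 }')

printf '%s\n' "$default_split"   # example1,|answer1
printf '%s\n' "$comma_split"     # example1|answer1
```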

Feherke.
feherke.ga
 
Now that I've done that..

Now I want there to be a header of information, approx. 3 lines.
How would I add this to the file? I think I could just echo it into a new file and append the results of the awk statement.

What if I wanted to convert the file to PDF?

I remember there was something to convert it to PDF...?
 
Hi

Macdadi112 said:
How would I add this to the file..
More exactly, you wish to add the header lines to the sample.txt input file, and then the Awk script should just preserve the first 3 lines and shuffle only the remaining lines?

The simplest is to just output them without saving:
Code:
BEGIN {
    FS = ", *"
}

NR <= 3 {
    print
    next
}

# ... and the old script follows here...

As mentioned in my post at 26 May 16 07:22, the consumed values are set to "" to avoid returning duplicates. From the script's point of view there is no difference whether an item in the data array is empty because it was already picked randomly and output, or because it belongs to a header line and was never added to the data array. When a random pick yields an empty value, the script just tries picking another.
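
The point about header rows can be checked with a small sketch that leaves the shuffling out and only reports which row numbers end up with an entry in data. The helper name show_rows and the sample header text are made up for this illustration.

```shell
# Hypothetical helper: passes 3 header lines through, stores column 1 of the
# remaining lines, then reports which row numbers have an entry in data.
show_rows() {
    awk -F ', *' '
    NR <= 3 {
        print
        next
    }

    {
        data[NR, 1] = $1
    }

    END {
        for (row = 1; row <= NR; row++) {
            value = ((row, 1) in data) ? data[row, 1] : "(no entry)"
            print "row " row ": " value
        }
    }'
}

printf 'line 1\nline 2\nline 3\nexample1, answer1\nexample2, answer2\n' | show_rows
```

Rows 1 to 3 have no entry, so a lookup such as data[2, 1] in random() yields an empty string and the do/while loop simply draws another row number.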


Feherke.
feherke.ga
 
Cheers for that.

What if there are more than 2 variables per line.

For example:

Code:
example1, answer1
example2, answer2
example3, answer3
example4, answer4, answer4b, answer4c
example5, answer5, answer5b
example6, answer6
example7, answer7
example8, answer8, answer8b
example9, answer9
example10, answer10, answer10b

and the output would remain the same??
 
Hi

Macdadi112 said:
What if there are more than 2 variables per line.
I thought of that when I decided to store the data in a 2-dimensional array. But the script still needs a couple of changes:
Code:
BEGIN {
    FS = ", *"
    maxfield = 0
}

NR <= 3 {
    print
    next
}

{
    for (i = 1; i <= NF; i++)
        data[NR, i] = $i

    if (maxfield < NF)
        maxfield = NF
}

END {
    srand()
    for (i = 0; i < 1; i++) {
        for (j = 1; j <= maxfield; j++)
            $j = random(j)

        print
    }
}

function random(column)
{
    do {
        number = int(rand() * NR + 1)
    } while (! (result = data[number, column]))
    data[number, column] = ""
    return result
}
Note that the script can now output at most as many lines as there are input lines with the maximum field count. I mean, in the example you posted the maximum field count is 4 and there is only 1 row containing 4 fields. So only 1 shuffled output line can be requested; asking for more will send the script into an infinite loop.
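
That limit can be computed up front instead of being worked out by hand. The following is only a sketch, assuming the same 3 header lines and the same field separator as the script above; the helper name max_shuffled_lines and the count and limit variables are invented for the illustration. It counts the usable values in each column and prints the smallest count, which is the largest number of output lines that cannot run any column dry.

```shell
# Hypothetical helper: prints how many shuffled lines can be requested safely.
max_shuffled_lines() {
    awk -F ', *' '
    NR > 3 {
        for (i = 1; i <= NF; i++)
            if ($i)                 # same truth test as the while loop in random()
                count[i]++
        if (maxfield < NF)
            maxfield = NF
    }

    END {
        limit = count[1] + 0
        for (i = 2; i <= maxfield; i++)
            if (count[i] < limit)
                limit = count[i]
        print limit
    }'
}

# Usage: max_shuffled_lines < sample.txt
```

For the example input with 3 header lines in front, column 4 holds a single value (answer4c), so the result is 1, matching the loop bound used in the END block.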


Feherke.
feherke.ga
 
This is what I've got in sample.txt:

Code:
example1, answer1
example2, answer2
example3, answer3
example4, answer4, answer4b, answer4c
example5, answer5, answer5b
example6, answer6
example7, answer7
example8, answer8, answer8b
example9, answer9
example10, answer10, answer10b

and when I run the program above, I get:

Code:
$ awk -f tektips.awk sample.txt
line 1
line 2
line3
example10 answer9 answer8b answer4c

What I would like it to do is ignore the first three lines and then produce an example/answer for each number line.
The number line could have a number of fields??


Thanks
 
Hi

You mean you want the 3 header lines discarded without outputting them? Then just comment out (or remove completely) the print used for those 3 lines:
Code:
NR <= 3 {
#    print
    next
}

Or if you are sure you will never want to do anything with the header lines in the future, you can remove that action completely and move the condition to the next action:
Code:
BEGIN {
    FS = ", *"
    maxfield = 0
}

NR > 3 {
    for (i = 1; i <= NF; i++)
        data[NR, i] = $i

    if (maxfield < NF)
        maxfield = NF
}

# ... and the old script follows here...

Macdadi112 said:
then produce an example/answer for each number line.
The number line could have a number of fields??
Sorry, I do not understand this. By "number line" do you mean any of the input lines except the 3 header lines?

Or maybe the problem is older and I also misunderstood the meaning of the newly appeared fields.
Macdadi112 at 27 May 16 10:24 said:
and the output would remain the same??
Here I supposed "same" means the same rule: output as many fields of distinct values as there were fields in the input.


Feherke.
feherke.ga
 