My dataset looks like this:
ID X
1001 5
1002 3
. .
. .
1002 4
1003 3
1003 6
1003 4
. .
. .
1003 3
1004 2
. .
. .
. .
Now I need only one row per ID, holding a single X value. For example, I need the sum of the X values for ID 1002.
The problem is that the number of rows per ID always changes: sometimes the same ID repeats over the next three lines, sometimes over the next eight.
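To make the desired result concrete, here is a minimal Python sketch of the aggregation, assuming the data is available as (ID, X) pairs. The function name `sum_x_by_id` and the `sample` list are hypothetical; the sample contains only the rows visible in the listing above (the elided ". ." rows are left out), so the totals are illustrative only.

```python
from collections import defaultdict


def sum_x_by_id(rows):
    """Return {ID: sum of X}, with IDs in order of first appearance."""
    totals = defaultdict(int)
    for row_id, x in rows:
        # Accumulate per ID; it does not matter how many rows an ID spans.
        totals[row_id] += x
    return dict(totals)


# Assumption: only the rows visible in the listing; elided rows are omitted.
sample = [
    (1001, 5),
    (1002, 3), (1002, 4),
    (1003, 3), (1003, 6), (1003, 4), (1003, 3),
    (1004, 2),
]

totals = sum_x_by_id(sample)
for row_id, total in totals.items():
    print(row_id, total)
```

Because the totals are keyed by ID, the result does not depend on whether an ID spans three rows or eight, or on the rows for one ID being adjacent.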