1. Registered User
Join Date
Apr 2011
Posts
13

## help applying formulae

Hello all

Code:
```
   col(1)                col(2)     col(3)
6178/32032003,    98925,        2
6178/32033001,    98920,        1
6178/32033003,    98925,        5
6178/32033004,    98925,        3
EXAMPLE
6178/32034001,    98925,        3
6178/32034001,    98920,	1
END
6178/32034003,    98925,	8
6178/32034004,    98925,	4
6178/32035001,    98925,	1
6178/32035004,    98925,	1
6178/32036001,    98925,	2

per col(1) value  =  ((col(2) x col(3)) + (col(2) x col(3))) / (col(3) + col(3))
Formula 1) avg col(2) for 6178/32034001  =  ((98925 x 3) + (98920 x 1)) / (3 + 1)

total  =  ((col(2) x total of col(3)) + (col(2) x total of col(3))) / (total of col(3) + total of col(3))
Formula 2) total avg col(2)  =  ((98925 x 29) + (98920 x 2)) / (29 + 2)
```
a) Each CSV row above is built from the three member variables of a struct. Each struct object is stored as an element of a vector, so in this case the vector has 11 elements, each holding 3 struct variables
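
For reference, the layout described in a) might look roughly like the sketch below. The actual struct is not shown in this post, so `Row`, `id`, `value`, `weight` and `parseRow` are made-up names, and the parsing assumes every line has exactly the "text, number, number" shape of the sample data.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical struct: the names are assumptions, not taken from the post.
struct Row {
    std::string id;  // col(1), e.g. "6178/32034001"
    long value;      // col(2), e.g. 98925
    long weight;     // col(3), e.g. 3
};

// Parse one "id, value, weight" line into a Row.
Row parseRow(const std::string& line) {
    std::istringstream in(line);
    Row r{};
    std::getline(in, r.id, ',');         // text up to the first comma
    char comma = 0;
    in >> r.value >> comma >> r.weight;  // operator>> skips spaces and tabs
    assert(!in.fail());                  // sketch only: real code should report bad rows
    return r;
}

// Build the vector, one element per CSV line.
std::vector<Row> parseRows(const std::vector<std::string>& lines) {
    std::vector<Row> rows;
    for (const std::string& line : lines)
        rows.push_back(parseRow(line));
    return rows;
}
```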

b) Formula 1): applies to the col(2) values of each col(1) value that appears more than once, e.g. "6178/32034001" appears twice
- the col(2) value for the first appearance of "6178/32034001" is "98925", and the col(2) value for the second appearance is "98920"
- the col(3) value that goes with "98925" is "3", and the col(3) value that goes with "98920" is "1"

c) Formula 2): uses the sum of the col(3) values for each distinct col(2) value
- the sum of the col(3) values that appear against col(2) = "98925" is 29 in the above case (2+5+3+3+8+4+1+1+2 = 29)
- the sum of the col(3) values that appear against col(2) = "98920" is 2 in the above case (1+1 = 2)

Many Thanks

2. Senior Member
Join Date
Dec 2003
Posts
3,366
Your example is pretty short, so you need to know for yourself if the data is always sorted or organized in any way ahead of time or if you need to do that yourself. If it is NOT, you need to sort the data by col 1 first.

Once it is sorted, it just becomes a set of loops over the data. Because it is sorted, any rows with matching col 1 values sit next to each other, so while looping you only need to compare the current row with the next one (or the current with the previous, if you prefer that logic) to find every match. You can then apply the formula as needed, saving the new data in a new column or wherever you like.
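
A minimal sketch of that sort-then-loop idea for Formula 1 could look like this. The struct and function names (`Row`, `averageDuplicates`) are made up for illustration, and the code assumes col 2 and col 3 are integers as in the sample data.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical struct: id = col 1, value = col 2, weight = col 3.
struct Row { std::string id; long value; long weight; };

// Returns (id, weighted average of col 2) for every id that occurs more than once.
std::vector<std::pair<std::string, double>> averageDuplicates(std::vector<Row> rows) {
    // Sort by col 1 so that equal ids end up adjacent.
    std::sort(rows.begin(), rows.end(),
              [](const Row& a, const Row& b) { return a.id < b.id; });

    std::vector<std::pair<std::string, double>> result;
    std::size_t i = 0;
    while (i < rows.size()) {
        // Walk j forward over the run of rows sharing rows[i].id.
        std::size_t j = i;
        double weightedSum = 0.0;
        long totalWeight = 0;
        while (j < rows.size() && rows[j].id == rows[i].id) {
            weightedSum += static_cast<double>(rows[j].value) * rows[j].weight;
            totalWeight += rows[j].weight;
            ++j;
        }
        if (j - i > 1) {               // only ids that appear more than once
            assert(totalWeight != 0);  // sketch assumes col 3 does not sum to zero
            result.emplace_back(rows[i].id, weightedSum / totalWeight);
        }
        i = j;                         // jump to the start of the next run
    }
    return result;
}
```

With the sample rows from the first post, the only repeated key is 6178/32034001, and the result for it is ((98925 x 3) + (98920 x 1)) / 4 = 98923.75.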

The same idea works for c), though you should re-sort the data by column 2.
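
For c), a rough sketch under the same assumptions (made-up names `Row`, `weightTotalsByValue` and `totalAverage`, integer columns) might sort by col 2, total col 3 for each distinct col 2 value, and then combine the totals as in Formula 2.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical struct: id = col 1, value = col 2, weight = col 3.
struct Row { std::string id; long value; long weight; };

// Total of col 3 for each distinct col 2 value, as (value, total) pairs
// in ascending order of value.
std::vector<std::pair<long, long>> weightTotalsByValue(std::vector<Row> rows) {
    // Sort by col 2 so that equal values end up adjacent.
    std::sort(rows.begin(), rows.end(),
              [](const Row& a, const Row& b) { return a.value < b.value; });

    std::vector<std::pair<long, long>> totals;
    for (const Row& r : rows) {
        // A new col 2 value starts a new running total.
        if (totals.empty() || totals.back().first != r.value)
            totals.emplace_back(r.value, 0L);
        totals.back().second += r.weight;
    }
    return totals;
}

// Formula 2: sum of (value x total weight) divided by the sum of total weights.
double totalAverage(const std::vector<Row>& rows) {
    const std::vector<std::pair<long, long>> totals = weightTotalsByValue(rows);
    double weightedSum = 0.0;
    long totalWeight = 0;
    for (const auto& t : totals) {
        weightedSum += static_cast<double>(t.first) * t.second;
        totalWeight += t.second;
    }
    assert(totalWeight != 0);  // sketch assumes at least one row with non-zero col 3
    return weightedSum / totalWeight;
}
```

For the sample data the totals come out as 2 for 98920 and 29 for 98925, matching the first post. The final figure is the same as summing col 2 x col 3 over every row and dividing by the sum of col 3, so the sort is only needed if you also want the per-value totals.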

------------
There is a more annoying way to do it where you keep track of each data point manually rather than sorting, but there is no reason to do it. You spend as much time checking whether the current value is one you have already seen as you would have spent sorting ahead of time, and the algorithm is a lot more complex. It's almost never a good idea to do it that way unless your data set is very unusual. For example, if you only had 2 possible values in col 1, it would be better to do it this way, as the algorithm is simple enough and it would process large data sets fast.
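
For comparison, the no-sort bookkeeping could be sketched as below. Note that `std::map` is swapped in here as the "already seen" table in place of the manual tracking described above, and `Row`, `Totals` and `averageDuplicatesNoSort` are made-up names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical struct: id = col 1, value = col 2, weight = col 3.
struct Row { std::string id; long value; long weight; };

// Running totals kept for each col 1 value seen so far.
struct Totals {
    double weightedSum = 0.0;  // sum of col 2 x col 3
    long totalWeight = 0;      // sum of col 3
    int count = 0;             // number of rows with this id
};

// Weighted average of col 2 for every id that occurs more than once,
// computed in a single pass over the unsorted rows.
std::map<std::string, double> averageDuplicatesNoSort(const std::vector<Row>& rows) {
    std::map<std::string, Totals> seen;
    for (const Row& r : rows) {
        Totals& t = seen[r.id];  // creates a zeroed entry the first time an id appears
        t.weightedSum += static_cast<double>(r.value) * r.weight;
        t.totalWeight += r.weight;
        ++t.count;
    }

    std::map<std::string, double> result;
    for (const auto& entry : seen) {
        if (entry.second.count > 1) {
            assert(entry.second.totalWeight != 0);  // sketch assumes col 3 does not sum to zero
            result[entry.first] = entry.second.weightedSum / entry.second.totalWeight;
        }
    }
    return result;
}
```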

3. Registered User
Join Date
Apr 2011
Posts
13
Thanks for your help Jonnin :)
