I am a student taking a computer science course, so I need some help. Anyway, I need to find the mode of a single-dimension array, and I am only allowed to find the mode using only the array that I pass into the method as a parameter. Here is what I have done so far; it is not working. Can you have a look and tell me where I am going wrong?
public static double findMode(int y[])
{
int mode = 0;
int counter=0, counter2=0;
for (int i=0;i<totalNumbers;i++)
{
for(int j=0; j<100; j++)
{
int temp = y;
if (temp==j) {
counter++;
}
if(counter>counter2) {
counter2 = counter;
mode = temp;
}
}
}
return mode;
}
You should use a hashtable for this, and store the values from the array as Integer
objects there, mapping them like value -> count. As you loop through the array,
you check whether the current integer already exists in the hashtable; if it does,
you increment its count (also an Integer object), and if not, you store a new
value -> count (1) entry in the hashtable.
Then you can iterate through the hashtable to find the highest count, or sort it on
the count value.
Btw, mode is "The most common value obtained in a set of observations",
so why do you want to return a double, when checking an int array...?
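Something along these lines would be a minimal sketch of the idea (just an illustration: the method name findMode and the int[] parameter are carried over from the posted code, but it returns an int rather than a double):

Code:
import java.util.HashMap;
import java.util.Map;

public class ModeFinder {

    // Hashtable idea: map each value to its count, then pick the
    // value with the highest count.
    public static int findMode(int[] y) {
        if (y.length == 0) {
            throw new IllegalArgumentException("empty array");
        }
        Map<Integer, Integer> counts = new HashMap<Integer, Integer>();
        for (int i = 0; i < y.length; i++) {
            Integer c = counts.get(y[i]);
            counts.put(y[i], (c == null) ? 1 : c + 1);
        }
        int mode = y[0];
        int best = 0;
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            if (e.getValue() > best) {
                best = e.getValue();
                mode = e.getKey();
            }
        }
        return mode;
    }

    public static void main(String[] args) {
        int[] data = {3, 7, 3, 1, 7, 3};
        System.out.println(findMode(data)); // prints 3
    }
}

Both passes are linear, so this is O(n) time with at most O(n) extra space for the counts.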
Originally Posted by nativejam, slightly reformatted
Code:
public static double findMode(int y[]) {
int mode = 0;
int counter = 0, counter2 = 0;
for (int i=0; i<totalNumbers; i++) {
for(int j=0; j<100; j++) {
int temp = y;
if (temp==j) {
counter++;
}
if(counter>counter2) {
counter2 = counter;
mode = temp;
}
}
}
return mode;
}
The blue things have already been mentioned; just this: did you mean y.length instead of totalNumbers?
The red things are confusing me:
1. where does this 100 come from? Is it a maximum value that you expect to be an upper bound of the elements of y? If so, Y?
If not, is it rather y.length?
2. the assignment int temp = y; cannot work since temp is of type int whereas y is of type int[].
Did you mean int temp = y[i]; or rather int temp = y[j];?
I'm wondering what this code is computing. I'm wondering also if computing the mode in sjalle's sense can be done in linear time and better than linear space...?
(It definitely can be done in linear time AND space, but the code above - once the type errors are fixed, at least - seems to have quadratic, or rather O(n*m), time complexity; n being the number of values and m being the size of the range of values.)
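For what it's worth, here is a sketch of a working brute-force variant of the posted method with those corrections applied (y.length instead of totalNumbers, indexing with y[i]/y[j], the counter reset for every candidate, and an int return type). It needs no extra storage but is O(n^2) in time:

Code:
// Brute force: for each element, count how many times it occurs
// in the whole array and remember the value with the highest count.
public static int findMode(int[] y) {
    if (y.length == 0) {
        throw new IllegalArgumentException("empty array");
    }
    int mode = y[0];
    int bestCount = 0;
    for (int i = 0; i < y.length; i++) {
        int counter = 0;                    // reset for every candidate y[i]
        for (int j = 0; j < y.length; j++) {
            if (y[j] == y[i]) {
                counter++;
            }
        }
        if (counter > bestCount) {
            bestCount = counter;
            mode = y[i];
        }
    }
    return mode;
}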
I timed it at 4.3 sec on a 1.8 GHz machine to find the mode in an array of
10,000,000 random integers with values varying from 1 to 1000, but I guess
this method is just brute force... I have attached the source.
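Roughly, the test amounts to something like this (a sketch only, not the actual attached source; it assumes a hashtable-style findMode like the ModeFinder sketched above):

Code:
import java.util.Random;

public class ModeTimer {
    public static void main(String[] args) {
        final int N = 10000000;        // 10,000,000 values
        final int MIN = 1, MAX = 1000;

        Random rnd = new Random();
        int[] data = new int[N];
        for (int i = 0; i < N; i++) {
            // uniform values in [MIN, MAX]; nextDouble() is one way to get them
            data[i] = MIN + (int) (rnd.nextDouble() * (MAX - MIN + 1));
        }

        long start = System.currentTimeMillis();
        int mode = ModeFinder.findMode(data);   // whichever implementation is under test
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("mode = " + mode + ", time = " + elapsed + " ms");
    }
}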
Hm, seems like some words on the notion of computational complexity could be dropped here...
But I won't do that now, only this:
What you have implemented is the worst-case linear-time (in n), linear-space (in m <= n) algorithm which, as far as I can see by now, is optimal (i.e. optimal in the sense of worst-case complexity). I have not been quite exact in what I was chatting about above, sorry. What I had in mind was average-case complexity, which depends on the frequency distribution of the values. In your code the values are distributed uniformly as a consequence of the use of Random.nextDouble(). BTW: min + Random.nextInt(max - min) does the job as well.
But if you consider e.g. a normal (Gaussian) distribution, it is very likely that towards the end of the array you can already tell that no value can occur more often than the "mode" found so far - simply because there aren't enough values left. So you could just as well stop and output the mode, although you would then only know lower and upper bounds for its frequency rather than the exact number of occurrences.
Also, you could do the frequency counting and the search for the maximum count in the same loop, by keeping variables for both the maximum and the maximum-but-one - see the sketch below.
As said, all this is about the average case, and that remains to be defined/analyzed. The in-same-loop thing doesn't touch the complexity at all - yet it may yield a (constant, therefore not a complexity issue) speed-up of up to a factor of 2, I think.
Hm, not very concise again, is it? Anyway, just some ideas. If there's some time left I may try 'em. Or not...
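Ok, a rough sketch of that single-pass idea while I'm at it (hypothetical code; maxCount and runnerUp play the roles of the "maximum" and "maximum-but-one" mentioned above, and the early exit fires once the leader can no longer be caught):

Code:
import java.util.HashMap;
import java.util.Map;

public class EarlyExitMode {

    public static int findMode(int[] y) {
        if (y.length == 0) {
            throw new IllegalArgumentException("empty array");
        }
        Map<Integer, Integer> counts = new HashMap<Integer, Integer>();
        int leader = y[0];     // current candidate for the mode
        int maxCount = 0;      // its count so far
        int runnerUp = 0;      // highest count seen for any other value

        for (int i = 0; i < y.length; i++) {
            int v = y[i];
            Integer old = counts.get(v);
            int c = (old == null) ? 1 : old + 1;
            counts.put(v, c);

            if (v == leader) {
                maxCount = c;
            } else if (c > maxCount) {
                runnerUp = maxCount;   // the old leader becomes the runner-up
                leader = v;
                maxCount = c;
            } else if (c > runnerUp) {
                runnerUp = c;
            }

            // Early exit: even if every remaining element went to the runner-up,
            // it could not overtake the current leader any more.
            int remaining = y.length - (i + 1);
            if (maxCount > runnerUp + remaining) {
                return leader;
            }
        }
        return leader;
    }
}

On uniformly distributed values the exit will rarely fire before the end, which is exactly the average-case caveat above; on a skewed distribution it can cut the pass short.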
I agree that this is a 'no brain, no pain' solution, but from a machine-time point of view,
working out that the job is done before the end of the array is reached may require
enough extra processing to make it the slower algorithm overall, but... I haven't tried.
Hey no! Your solution definitely ain't "no brain" - at least you're using a hashtable with approximately constant search cost (wrt time). One could do much worse. To say it again: I think this is optimal in the sense of worst-case complexity.
As for the trade-off of my ideas - seems someone's gotta analyze it...