Help required


DevX Home    Today's Headlines   Articles Archive   Tip Bank   Forums   

Results 1 to 3 of 3

Thread: Help required

  1. #1
    Join Date
    Feb 2009
    Posts
    1

    Exclamation Help required

    Hey guys. I've got a question regarding binary and sequential search.

    If the list has 1024 items (lg1024 = 12) at what point (the number of searches) does sorting the list first and using binary search pay off?

    Thanks a lot and I was not sure if this belonged in the C++ forum.

  2. #2
    Join Date
    Jan 2005
    Location
    UK
    Posts
    604
    Well it's not strictly a C++ question, but the forum is certainly right, coz most of us had similar problems in the past.
    Generally binary search has a complexity of O(log N) and linear search has a complexity of O(N) which means basically the log N and N respectively are the most significant term in the equation. -> look up O() - notation on Wikipedia or so.
    the real cost for binary search could be something like c0*log(N) + c1
    the
    You are looking for a N_0 where the
    the real cost for binary search could be something like c0*log(N) + c1
    the real cost for linear search could be something like c2*N + c3
    you want to find a N0 where c0*log(N0) + c1 < c2*N + c3
    where c0,c1,c2,c3 are some constants that describe the effort for the overhead of your algorithm. The exact answer to your question depends on the implementation of your algorithms, and on the data you are using. Efficiency calculations of this sort are usually done for *BIG* numbers of N and for uniformly distributed search-values. and for such numbers binary search will almost always win (unless your are using de-generated search-values), so you know that such a N0 really exists. But for small values of N your best bet is just running a few tests...
    And here are a few questions you have to ask yourself:
    1024 *different* items? yes/no (duplicates allowed)?
    list of items sorted? => linear seach might be best always

    Hope that helps and gives you a few pointers...
    DKyb
    -------------------------------
    Life is a short warm moment -
    Death is the long cold rest.
    Pink Floyd
    -------------------------------

  3. #3
    Join Date
    Dec 2003
    Posts
    3,366
    In theory, and in short, sequential search can potentially search all the elements (1024 in your case) and can potentially find it on the first try too, and has an average time of 512 (50&#37; of the time, you find it in the first half, unless you know something more about the searches vs the "random" ordering).

    Also in theory and in short, a sort takes N (for certain types of data) time, or N * log N time for most ordinary sorts. A binary search takes log(n) searches.

    So, from that, we see these facts:
    1) the absolute best sort is the same as the worst sequential search, so if you only search one time, it does not pay to sort.

    2) for 1024, an N LOG N sort is 10 X 1024 operations, so you can do 10 searches before it pays to sort.

    3) at best, a couple more sequential searches can be done before a binary search begins to win out. So if you did 11 sequential searches, so what, you paid roughly 500 more operations which is nothing at all on a modern computer. 500, 1000, 1500, .... at some point, though, the binary search is a clear win. Maybe thats 20 searches, maybe its 11, thats up to YOU and how time critical your code really is.

    Honestly, for such a small list, sequential is probably fine if you are not on a small, weak computing device. If you are on such a device, a quick shell sort (not recursive, some small devices do not have room for recursion stacks etc) or a quicksort/std::sort etc can be used.

    For reference, the way to sort in N time is the integer sort, I think its called bucket or radix or something odd, but you basically set up an array of all zeros across the range of data that is valid (if you were sorting unsigned chars, for example, it would be int buckets[256] = {0};) and then iterate over your list and increment the array buckets (if you find a 23, then buckets[23]++) --- so one pass over the data and you have a sorted list in a funky format, at worst a second pass over the bucket array can be made to generate the actual real sorted list if the bucket format is not useful.

    Also for reference, I have sorted random lists of millions of doubles in under a second on a dual core machine.
    Last edited by jonnin; 02-24-2009 at 03:00 PM.

Similar Threads

  1. C++,Unix Professionals required for CADENCE India
    By sapphirecs in forum Careers
    Replies: 0
    Last Post: 12-06-2006, 02:36 PM
  2. Required Field Validation
    By spudmasher in forum ASP.NET
    Replies: 3
    Last Post: 08-09-2005, 02:19 AM
  3. Replies: 1
    Last Post: 09-07-2001, 04:03 PM
  4. Replies: 1
    Last Post: 09-07-2001, 04:03 PM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
HTML5 Development Center
 
 
FAQ
Latest Articles
Java
.NET
XML
Database
Enterprise
Questions? Contact us.
C++
Web Development
Wireless
Latest Tips
Open Source


   Development Centers

   -- Android Development Center
   -- Cloud Development Project Center
   -- HTML5 Development Center
   -- Windows Mobile Development Center