c - Optimised arraylist -


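I have a program that needs to do many comparisons of strings in an array. My first thought was, of course, to use strcmp to check whether two strings in the array are the same. I'm now considering another option where I only need to compare the pointers to the strings. That would involve some preparation to map each element so that strings which are literally the same point to the same place in memory.

I've done this, doing the preparation with strcmp, and with strstr (which I believe is faster). But because I need to check every string in order to map them to their first occurrences, I get horribly long preparation times. I should mention that the array is several MB large.

Here is an example of what I want to do: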

[0x0: "i", 0x1: "am", 0x2: "done", 0x3: "here.", 0x4: "i", 0x5: "have", 0x6: "done", 0x7: "everything!"]

[0x10: 0x0, 0x11: 0x1, 0x12: 0x2, 0x13: 0x3, »0x14: 0x0«, 0x15: 0x5, »0x16: 0x2«, 0x17: 0x7]

So my question: is there a way to do this kind of mapping faster than I am doing it?

If it's just to see whether duplicates exist... run qsort() on the array of strings, and if the sort function finds a duplicate, you can bail early. Or if you need to remove the duplicates, let the sorting complete, then linearly iterate from the bottom of the list and pull them out as you find them (since duplicates will be next to each other).

If the strings are relatively different, strcmp() will realistically only need to check the first handful of characters before bailing on a failed match. It may not be as bad as you'd think.

Granted, the ease of doing this depends on how the strings are stored in memory.
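A minimal sketch of the sort-then-scan idea, assuming the strings are held as an array of `char` pointers. The helper names `cmp_str` and `sort_and_dedupe` are made up for illustration. Note that a standard qsort() comparator has no clean way to abort the sort, so this sketch always lets the sort finish and then removes duplicates in one linear pass:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array whose elements are char pointers. */
static int cmp_str(const void *a, const void *b)
{
    const char *sa = *(const char *const *)a;
    const char *sb = *(const char *const *)b;
    return strcmp(sa, sb);
}

/* Hypothetical helper: sorts `strs` in place, then compacts it so that
   each distinct string appears once. Returns the number of unique strings. */
static size_t sort_and_dedupe(const char **strs, size_t n)
{
    if (n == 0)
        return 0;

    qsort(strs, n, sizeof *strs, cmp_str);

    /* After sorting, equal strings are adjacent, so comparing each
       element with the last one kept is enough. */
    size_t out = 1;
    for (size_t i = 1; i < n; i++) {
        if (strcmp(strs[i], strs[out - 1]) != 0)
            strs[out++] = strs[i];
    }
    return out;
}
```

Sorting reorders the array, so this only fits when the original positions do not matter.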

Update:

OK, based on your update... Matt's suggestion of a hash table would work best:

  • Iterate through the list one by one.
  • Hash each string.
  • Check whether it already exists in the table.
  • If not, add it to the table and proceed.
  • If so, use the existing index from the table.
  • ... and proceed to the next.

I'd imagine this should give relatively decent performance overall.
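The steps above could be sketched roughly as follows. This is only one way to do it, under several assumptions: the strings are NUL-terminated and held as an array of pointers, the hash is FNV-1a (picked for simplicity, not taken from the question), and collisions are resolved with linear probing. The names `hash_str` and `map_first_occurrence` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* FNV-1a string hash, chosen here only for illustration. */
static uint64_t hash_str(const char *s)
{
    uint64_t h = 14695981039346656037ULL;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 1099511628211ULL;
    }
    return h;
}

/* Hypothetical helper: fills first[i] with the index of the first string
   equal to strs[i]. Returns 0 on success, -1 if allocation fails. */
static int map_first_occurrence(const char *const *strs, size_t n, size_t *first)
{
    /* Power-of-two capacity of at least 2n keeps the table at most half full. */
    size_t cap = 16;
    while (cap < 2 * n)
        cap *= 2;

    /* Each slot stores (index + 1), so 0 means "empty". */
    size_t *slots = calloc(cap, sizeof *slots);
    if (!slots)
        return -1;

    for (size_t i = 0; i < n; i++) {
        size_t pos = (size_t)(hash_str(strs[i]) & (cap - 1));

        /* Linear probing: stop at an empty slot or at an equal string. */
        while (slots[pos] != 0 && strcmp(strs[slots[pos] - 1], strs[i]) != 0)
            pos = (pos + 1) & (cap - 1);

        if (slots[pos] == 0)
            slots[pos] = i + 1;      /* first time this string is seen */
        first[i] = slots[pos] - 1;   /* index of its first occurrence */
    }

    free(slots);
    return 0;
}
```

Once `first` is filled in, two entries hold the same string exactly when `first[a] == first[b]`, so later comparisons need no strcmp at all. On average each string is hashed once and compared against only a few candidates, rather than against every earlier string.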

