Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Evaluate the use of warp's hash string algorithm (in D) #94

Open
radare opened this issue Feb 1, 2016 · 0 comments
Open

Evaluate the use of warp's hash string algorithm (in D) #94

radare opened this issue Feb 1, 2016 · 0 comments

Comments

@radare
Copy link
Collaborator

radare commented Feb 1, 2016

original code from https://github.com/facebookarchive/warp

    static size_t getHash(const(uchar)[] name)
    {
        size_t hash = 0;
        while (name.length >= size_t.sizeof)
        {
            hash = hash * 37 + *cast(size_t*)name.ptr;
            name = name[size_t.sizeof .. $];
        }
        foreach (c; name)
            hash = hash * 37 + c;
        hash ^= (hash >> 20) ^ (hash >> 12);
        return hash ^ (hash >> 7) ^ (hash >> 4);
    }

in C would be something like:

size_t getHash(const char *str) {
  size_t hash = 0, ptr = str;
  while(strlen(ptr) >= sizeof(size_t)) {
    hash = hash * 37 + *ptr;
    ptr++;
  }
  for (str = ptr; *str; str++) {
    hash = hash * 37 + *str;
  }
  hash ^= (hash >> 20) ^ (hash >> 12);
  return hash ^ (hash >> 7) ^ (hash>> 4);
}

the strlen can be optimized by checking if any of the Nth chars is null , where N is constant for sizeof (return_type), in theory this hash have less collisions, but right now we have no reproducible collisions in the current string hashing algorithm used.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant