Squashed 'tommyds/' changes from 183b568..56fae10
56fae10 Some more comments about the hash used
92d1e36 Update comments about integer hash functions
2762316 Add a note about the hash function used the benchmark
085d105 Make more clear that hash function we use
1180371 Split the googlelibchash and concurrencykit graph
c4c3660 Add a new list count operation with O(n) complexity
b3af524 Changes to version 2.1.
0823405 Updates the HISTORY.
37e36bf Drops doxygen dependency for Travis CI.
828e796 Uses a more complete Travis CI script.
6af2f22 Fixes another typo in the legend.
3389d11 Uses explict var for objdump and drop -fverbose-asm.
4d318ab Removes index and remainder uses as old gcc raise a warning for them.
055884d Fixes a typo in the legend.
a253eeb Removes compiler warnings due different type size.
c203b5c Moves the addressing functions of hashlin to the header.
6bbcc56 Allows to redefines CC and CXX with environment variables.
5eac92a Uses new compiler options for generating the benchmark.
3af910f Uses fixed arch options to build the benchmark.
56c897e Doesn't build by default the bench prog.
b60c7b6 Decreases the requirement of C++11 std. C++0x seems enough.
a8d61cb Uses gcc options valid for any native machine.
852031d Adds support for Travis CI.
ee7d007 Some documentation changes.
494a394 Default run for valgrind is the check program.
84fc45f Uses a simpler statement to compute the split bit in hashlin.
e3da5a4 Reimplements the bucket addressing for hashling.
1f3b338 More documentation changes.
7266227 Removes an unnecessary integer division in hashdyn.
7e1dac6 Various documentation changes.
21140d2 Adjusts the HISTORY with latest changes.
e10f5da Even more code coverage. Now at 100%.
9b37acc Removes some unnecessary parenthesis from the code.
aec0d2f Updates the design comments.
108f669 Adds more test coverage. Now at 99%.
3f9ef98 Fixes tommy_trie_inplace when removing duplicate elements.
b4719b6 Minor changes at the documentation.
6d1b159 Adds support for code coverage.
12143b9 Cleans up the benchmark source.

git-subtree-dir: tommyds
git-subtree-split: 56fae105c119defcedf219f9a4668342e84d370b
andresy committed Mar 10, 2015
1 parent fdbe35e commit 795d28d
Showing 32 changed files with 1,219 additions and 493 deletions.
6 changes: 6 additions & 0 deletions .gitignore
@@ -26,6 +26,12 @@ stamp-h1
*.s
*.S

# coverage
*.gcda
*.gcno
*.info
cov/

# project
*.dst
*.epr
10 changes: 10 additions & 0 deletions .travis.yml
@@ -0,0 +1,10 @@
# Travis CI configuration file

language: c

script: make check

compiler:
- clang
- gcc

8 changes: 7 additions & 1 deletion HISTORY
@@ -1,12 +1,18 @@
TommyDS HISTORY
===============

2.0 2014/10
2.1 2014/12 BETA
================

2.0 2014/12
===========
* Fixed a Segmentation Fault bug in the trie_inplace container when inserting
duplicate elements.
* Faster array and hashlin implementation when accessing elements.
* Added new hashtable functions to iterate over all the elements.
* Added a new tommy_calloc() function used for allocating initialized memory.
If you redefined tommy_malloc(), likely you have to redefine also tommy_calloc().
* Reached 100% code coverage in the regression test.
* Different source code organization.
* Added benchmark comparison with Binary Search Tesseract by Gregorius van
den Hoven.
5 changes: 3 additions & 2 deletions INSTALL
@@ -1,7 +1,8 @@
TommyDS INSTALL
===============

TommyDS doesn't need any installation. You have only to import the required .c
and .h files into your program and start using them.
TommyDS doesn't need any installation.

You have only to import the required .c and .h files into your program
and use the them.

73 changes: 55 additions & 18 deletions Makefile
@@ -1,17 +1,30 @@
#############################################################################
# Tommy Makefile

VERSION=2.0
CFLAGS=-m32 -O3 -march=pentium4 -mtune=generic -Wall -Wextra -Wshadow -Wcast-qual -g
# -std=gnu++11 required by Google btree
CXXFLAGS=$(CFLAGS) -fpermissive -std=gnu++11
CC=gcc
CXX=g++
UNAME=$(shell uname)
# Version of TommyDS
VERSION = 2.1

# Build options for the check program
ifdef COVERAGE
CFLAGS = -O0 -g -fprofile-arcs -ftest-coverage
else
CFLAGS = -O3 -march=native -Wall -Wextra -Wshadow -Wcast-qual -g
endif

# Build options for the benchmark
# -std=gnu++0x required by Google btree
BENCHCXXFLAGS = -m32 -O3 -march=nehalem -fpermissive -std=gnu++0x -Wall -g

# Programs
CC ?= gcc
CXX ?= g++
OBJDUMP ?= objdump
UNAME = $(shell uname)

# Linux
ifeq ($(UNAME),Linux)
LIB=-lrt benchmark/lib/judy/libJudyL.a benchmark/lib/judy/libJudyMalloc.a
LIB=-lrt
BENCHLIB=benchmark/lib/judy/libJudyL.a benchmark/lib/judy/libJudyMalloc.a
EXE=
O=.o
endif
@@ -25,13 +38,13 @@ endif

# Windows
ifeq ($(UNAME),)
LIB=benchmark/lib/judy/src/judy.lib
BENCHLIB=benchmark/lib/judy/src/judy.lib
EXE=.exe
O=.obj
endif

CHECK = ./tommybench -N 1000000 -d tommy-hashlin
#CHECK = ./tommycheck
#CHECK = ./tommybench -n 1000000 -d tommy-hashlin
CHECK = ./tommycheck

DEP = \
tommyds/tommyalloc.c \
@@ -67,24 +80,43 @@ DEPTEST = \
check.c \
benchmark.cc

all: tommycheck$(EXE) tommybench$(EXE)
all: tommycheck$(EXE)

bench: tommybench$(EXE)

tommy$(O): $(DEP)
$(CC) $(CFLAGS) -c tommyds/tommy.c -o tommy$(O)
$(CC) $(CFLAGS) -S -fverbose-asm tommyds/tommy.c -o tommy.s
objdump -S tommy$(O) > tommy.S
$(OBJDUMP) -S tommy$(O) > tommy.s

tommycheck$(EXE): check.c tommy$(O)
$(CC) $(CFLAGS) check.c tommy$(O) -o tommycheck$(EXE) $(LIB)

tommybench$(EXE): benchmark.cc $(DEP)
$(CXX) $(CXXFLAGS) benchmark.cc -o tommybench$(EXE) $(LIB)
$(CXX) $(BENCHCXXFLAGS) benchmark.cc -o tommybench$(EXE) $(LIB) $(BENCHLIB)

check: tommycheck$(EXE) tommybench$(EXE)
check: tommycheck$(EXE)
./tommycheck$(EXE)
./tommybench$(EXE) -N 100000
echo Check completed with success!

lcov_reset:
lcov -d . -z
rm -f ./lcov.info

lcov_capture:
lcov -d . --capture -o lcov.info

lcov_html:
rm -rf ./cov
mkdir cov
genhtml -o ./cov lcov.info

coverage:
$(MAKE) COVERAGE=1 tommycheck$(EXE)
$(MAKE) lcov_reset
./tommycheck$(EXE)
$(MAKE) lcov_capture
$(MAKE) lcov_html

valgrind:
valgrind \
--tool=memcheck \
@@ -136,8 +168,9 @@ web: phony tommyweb.doxygen tommy.css $(DEP)
rm -f web/tab_*.png

clean:
rm -f *.log *.s *.S *.lst *.o
rm -f *.log *.s *.lst *.o
rm -f *.ncb *.suo *.obj
rm -f *.gcno *.gcda lcov.info
rm -rf Debug Release x64
rm -f callgrind.out.*
rm -f cachegrind.out.*
@@ -176,3 +209,7 @@ dist:
zip -r $(DIST).zip $(DIST)
rm -r $(DIST)

distcheck: dist
tar zxvf $(DIST).tar.gz
cd $(DIST) && make check
rm -rf $(DIST)
12 changes: 8 additions & 4 deletions README
@@ -1,20 +1,24 @@
TommyDS
=======

TommyDS is a C library of hashtables and tries designed for high performance.
TommyDS is a C library of array, hashtables and tries data structures,
designed for high performance and providing an easy to use interface.

It's faster than all the similar libraries like rbtree, judy, goodledensehash,
khash, uthash, nedtries and others.

The data structures provided are:

tommy_list - A double linked list.
tommy_array - A linear array. It doesn't fragment the heap.
tommy_array - A linear array. It doesn't fragment
the heap.
tommy_arrayblk - A blocked linear array. It doesn't fragment
the heap and it minimizes the space occupation.
tommy_hashtable - A fixed size chained hashtable.
tommy_hashdyn - A dynamic chained hashtable.
tommy_hashlin - A linear chained hashtable. It doesn't have the
problem of the delay when resizing and it doesn't fragment
the heap.
problem of the delay when resizing and it doesn't
fragment the heap.
tommy_trie - A trie optimized for cache utilization.
tommy_trie_inplace - A trie completely inplace.

68 changes: 23 additions & 45 deletions benchmark.cc
@@ -186,7 +186,7 @@ typedef size_t ssize_t;
/* Concurrency Kit Hash Set */
/* http://concurrencykit.org/ */
/* Note that it has a VERY BAD performance on the "Change" test, */
/* so we disable it in the general graphs */
/* so we disable it in the graphs until further investigation */
/* #define USE_CK */
#if defined(USE_CK) && defined(__linux)
/* if you enable it, ensure to link also with the -lck option */
@@ -392,24 +392,24 @@ struct nedtrie_t nedtrie;
khash_t(word)* khash;
#ifdef __cplusplus
/* use a specialized hash, otherwise the performance depends on the STL implementation used. */
class cpp_tommy_inthash_u32 {
class cpp_hash {
public:
unsigned operator()(unsigned key) const { return tommy_inthash_u32(key); }
unsigned operator()(unsigned key) const { return hash(key); }
};
#endif
#ifdef USE_CPPMAP
typedef std::map<unsigned, struct cpp_object*> cppmap_t;
cppmap_t* cppmap;
#endif
#ifdef USE_CPPUNORDEREDMAP
typedef std::unordered_map<unsigned, struct cpp_object*, cpp_tommy_inthash_u32> cppunorderedmap_t;
typedef std::unordered_map<unsigned, struct cpp_object*, cpp_hash> cppunorderedmap_t;
cppunorderedmap_t* cppunorderedmap;
#endif
#ifdef USE_GOOGLELIBCHASH
struct HashTable* googlelibhash;
#endif
#ifdef USE_GOOGLEDENSEHASH
typedef google::dense_hash_map<unsigned, struct google_object*, cpp_tommy_inthash_u32> googledensehash_t;
typedef google::dense_hash_map<unsigned, struct google_object*, cpp_hash> googledensehash_t;
googledensehash_t* googledensehash;
#endif
#ifdef USE_GOOGLEBTREE
@@ -2525,7 +2525,7 @@ void test(unsigned size, unsigned data, int log, int sparse)

printf("%12u", the_max);

printf(" %16s %16s", DATA_NAME[the_data], ORDER_NAME[the_order]);
printf(" %18s %10s", DATA_NAME[the_data], ORDER_NAME[the_order]);

/* skip degenerated cases */
if (LAST[the_data][the_order] > TIME_MAX_NS) {
@@ -2546,7 +2546,7 @@ void test(unsigned size, unsigned data, int log, int sparse)
if (the_log) {
for(i=0;i<OPERATION_MAX;++i)
printf(" %4u", LOG[the_retry][the_data][the_order][i]);
printf(" [ms]\n");
printf(" [ns]\n");
}
}
}
@@ -2600,34 +2600,14 @@ void test(unsigned size, unsigned data, int log, int sparse)
}
}

void test_cache_miss(void)
void help(void)
{
unsigned size = 512*1024*1024;
unsigned char* DATA = (unsigned char*)malloc(size);
unsigned delta = 512;
tommy_uint64_t miss_time;
unsigned i, j;
tommy_uint64_t result = 0;

memset(DATA, 0, size);

for(j=0;j<8;++j) {
tommy_uint64_t start, stop;
start = nano();
for(i=0;i<size;i += delta) {
++DATA[i];
}
stop = nano();

if (!result || result > stop - start)
result = stop - start;
}

free(DATA);

miss_time = result * delta / size;

printf("Cache miss %d [ns]\n", (unsigned)miss_time);
printf("Options\n");
printf("-n NUMBER Run the test for the specified number of objects.\n");
printf("-m Run the test for the maximum number of objects.\n");
printf("-d DATA Run the test for the specified data structure.\n");
printf("-s Use a sparse dataset intead of a compact one.\n");
printf("-l Logs results into file for graphs creation.\n");
}

int main(int argc, char * argv[])
@@ -2636,7 +2616,6 @@ int main(int argc, char * argv[])
int flag_data = DATA_MAX;
int flag_size = 0;
int flag_log = 0;
int flag_miss = 0;
int flag_sparse = 0;

nano_init();
@@ -2649,12 +2628,10 @@ int main(int argc, char * argv[])
} else if (strcmp(argv[i], "-s") == 0) {
flag_sparse = 1;
} else if (strcmp(argv[i], "-m") == 0) {
flag_miss = 1;
} else if (strcmp(argv[i], "-n") == 0) {
flag_size = MAX;
} else if (strcmp(argv[i], "-N") == 0) {
} else if (strcmp(argv[i], "-n") == 0) {
if (i+1 >= argc) {
printf("Missing data in %s\n", argv[i]);
printf("Missing number of objects in %s\n", argv[i]);
exit(EXIT_FAILURE);
}
flag_size = atoi(argv[i+1]);
@@ -2672,21 +2649,22 @@ int main(int argc, char * argv[])
}
}
if (flag_data == DATA_MAX) {
printf("Unknown data %s\n", argv[i+1]);
printf("Unknown data name '%s'\n", argv[i+1]);
printf("Possible values are:\n");
for(j=0;j<DATA_MAX;++j) {
printf("\t%s\n", DATA_NAME[j]);
}
exit(EXIT_FAILURE);
}

++i;
} else {
printf("Unknown option %s\n", argv[i]);
help();
exit(EXIT_FAILURE);
}
}

if (flag_miss) {
test_cache_miss();
return EXIT_SUCCESS;
}

test(flag_size, flag_data, flag_log, flag_sparse);

printf("OK\n");
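
Note on the `cpp_hash` change in this file: the functor exists so that `std::unordered_map` and `google::dense_hash_map` are measured with the same integer hash as the other containers, rather than with whatever `std::hash` the STL in use provides. A minimal sketch of the pattern follows. The names `mix32`, `example_hash`, `example_map_t` and `fill_and_count` are hypothetical, and `mix32` is the MurmurHash3 32-bit finalizer used only as a stand-in; it is not claimed to be the `hash()` that benchmark.cc actually calls.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Stand-in integer mixer (MurmurHash3 32-bit finalizer).
// Assumption for illustration only: the benchmark uses its own hash().
inline std::uint32_t mix32(std::uint32_t h) {
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

// Functor in the style of cpp_hash: forwards the key to the chosen hash,
// so the container does not fall back to std::hash<unsigned>.
struct example_hash {
    std::size_t operator()(unsigned key) const { return mix32(key); }
};

// The functor is passed as the third template argument (the Hash parameter).
typedef std::unordered_map<unsigned, int, example_hash> example_map_t;

// Inserts the keys 0..n-1 and returns the number of elements stored.
inline std::size_t fill_and_count(unsigned n) {
    example_map_t m;
    for (unsigned i = 0; i < n; ++i)
        m[i] = static_cast<int>(i);
    return m.size();
}
```

Fixing the hash in the functor keeps the comparison between containers independent of the standard library implementation, which is the concern stated in the comment above `cpp_hash`.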
2 changes: 1 addition & 1 deletion benchmark.vcxproj
@@ -28,7 +28,7 @@
<ConfigurationType>Application</ConfigurationType>
<CharacterSet>MultiByte</CharacterSet>
<WholeProgramOptimization>true</WholeProgramOptimization>
<PlatformToolset>v100</PlatformToolset>
<PlatformToolset>v110</PlatformToolset>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'" Label="Configuration">
<ConfigurationType>Application</ConfigurationType>
3 changes: 2 additions & 1 deletion benchmark/gr_all.sh
@@ -6,7 +6,8 @@ export GNUPLOT_DEFAULT_GDFONT=arial
gnuplot gr_def_random_hit.gnu
gnuplot gr_def_random_change.gnu
gnuplot gr_other_judy_problem.gnu
gnuplot gr_other_slow_problem.gnu
gnuplot gr_other_googlelibchash_problem.gnu
gnuplot gr_other_ck_problem.gnu

DIR=data/core_i5_650_3G2_linux
gnuplot $DIR/gr_def.gnu gr_forward_insert.gnu
2 changes: 1 addition & 1 deletion benchmark/gr_common.gnu
@@ -8,7 +8,7 @@ set style data linespoints
set datafile missing "0"
set xlabel "Number of elements in logarithmic scale"
set ylabel "Time for element in nanosecond in logarithmic scale\nLower is better"
set xrange [1000:100000000]
set xrange [1000:10000000]
set yrange [6:1000]
set logscale y
set logscale x
2 changes: 1 addition & 1 deletion benchmark/gr_def_random_change.gnu
@@ -1,7 +1,7 @@
load "gr_common.gnu"

tdir = "def/"
tsub = "\nCore i5 650 3.20 GHz, 4 MB L3 cache, 2400 Uncore Speed\nLinux, gcc 4.7.1, 32 bit"
tsub = "\nCore i5 650 3.20 GHz, 4 MB L3 cache, 2400 Uncore Speed\nLinux, gcc 4.9.2, 32 bit"

set output bdir.tdir."img_random_change".bext
set title "Random Change (Remove + Insert)".tsub