min-wise independent linear permutations
The electronic journal of combinatorics, Tome 7 (2000)
A set of permutations ${\cal F} \subseteq S_n$ is min-wise independent if for any set $X \subseteq [n]$ and any $x \in X$, when $\pi$ is chosen at random in ${\cal F}$ we have ${\bf P} \left(\min\{\pi(X)\} = \pi(x)\right) = {{1}\over {|X|}}$. This notion was introduced by Broder, Charikar, Frieze and Mitzenmacher and is motivated by an algorithm for filtering near-duplicate web documents. Linear permutations are an important class of permutations. Let $p$ be a (large) prime and let ${\cal F}_p=\{p_{a,b}:\;1\leq a\leq p-1,\,0\leq b\leq p-1\}$ where for $x\in [p]=\{0,1,\ldots,p-1\}$, $p_{a,b}(x)=ax+b\pmod p$. For $X\subseteq [p]$ we let $F(X)=\max_{x\in X}\left\{{\bf P}_{a,b}(\min\{p(X)\}=p(x))\right\}$ where ${\bf P}_{a,b}$ is over $p$ chosen uniformly at random from ${\cal F}_p$. We show that as $k,p \to\infty$, ${\bf E}_X[F(X)]={{1}\over {k}}+O\left({(\log k)^3}\over {k^{3/2}}\right)$ confirming that a simply chosen random linear permutation will suffice for an average set from the point of view of approximate min-wise independence.
DOI :
10.37236/1504
Classification :
60C05, 68P20, 62G20, 68Q17
Mots-clés : min-wise independent linear permutations, detection and filtering of near-duplicate documents, AltaVista Web index algorithm, approximate min-wise independence
Mots-clés : min-wise independent linear permutations, detection and filtering of near-duplicate documents, AltaVista Web index algorithm, approximate min-wise independence
@article{10_37236_1504,
author = {Tom Bohman and Colin Cooper and Alan Frieze},
title = {min-wise independent linear permutations},
journal = {The electronic journal of combinatorics},
year = {2000},
volume = {7},
doi = {10.37236/1504},
zbl = {0949.60016},
url = {http://geodesic.mathdoc.fr/articles/10.37236/1504/}
}
Tom Bohman; Colin Cooper; Alan Frieze. min-wise independent linear permutations. The electronic journal of combinatorics, Tome 7 (2000). doi: 10.37236/1504
Cité par Sources :