How does exact match work?

Well, it's a bit of a kludge, because swish-e doesn't offer a way to match
words exactly. So, the work-around is to add the word 'xxbxx' at the beginning
of the string and the word 'xxexx' at the end (think of them as xxb(egin)xx
and xxe(nd)xx) and then search for a phrase (words in a particular order).

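The indexing side of this work-around can be sketched in a few lines of Python. This is only an illustration, not the code the indexer actually uses: the function name wrap_for_exact_match is hypothetical, and it assumes the marker words never occur in real data.

```python
# Marker words taken from the text above; the assumption is that they
# never appear in real field values.
BEGIN_MARK = "xxbxx"
END_MARK = "xxexx"


def wrap_for_exact_match(value):
    """Return the field value with the begin/end marker words around it."""
    # split() collapses runs of whitespace, so the markers always end up
    # as separate words next to the first and last real word.
    words = value.split()
    return " ".join([BEGIN_MARK] + words + [END_MARK])


print(wrap_for_exact_match("human"))
```
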
So, the title "human" will be indexed as "xxbxx human xxexx" if you want a
full exact match. You can then search it using the following phrases (the
numbers are the values of the e[nr] field in HTML forms):

1: exact match from beginning    "xxbxx human"
2: exact match from end          "human xxexx" (not really useful)
3: exact match begin and end     "xxbxx human xxexx"

Add 4 to those values (the numbers are really bit-masks :-) to produce a
wild-card match:

5: exact from beginning with wild-card    "xxbxx human*"
6: exact from end with wild-card          "human* xxexx"
7: exact begin+end with wild-card         "xxbxx human* xxexx"

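The mapping from these numbers to search phrases could be written roughly as follows. This is a sketch under assumptions: build_phrase and the constant names are made up for illustration, and for multi-word input the wild-card is simply appended to the last word, which the text above does not specify.

```python
BEGIN_MARK = "xxbxx"
END_MARK = "xxexx"

# Bit values as described above: 1 = begin, 2 = end, 4 = wild-card.
EXACT_BEGIN = 1
EXACT_END = 2
WILDCARD = 4


def build_phrase(words, mode):
    """Build the quoted phrase to search for, given a mode from 1 to 7."""
    parts = []
    if mode & EXACT_BEGIN:
        parts.append(BEGIN_MARK)
    term = " ".join(words.split())
    if mode & WILDCARD:
        # Assumption: the wild-card goes on the end of the user's input.
        term += "*"
    parts.append(term)
    if mode & EXACT_END:
        parts.append(END_MARK)
    return '"' + " ".join(parts) + '"'


for mode in (1, 2, 3, 5, 6, 7):
    print(mode, build_phrase("human", mode))
```
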
So, to define a field that is searched using an exact match from the
beginning with wild-card on TitleAndResponsibility, you would use:

<input type="hidden" name="f1" value="TitleAndResponsibility">
<input type="text" name="v1">
<input type="hidden" name="e1" value="5">


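On the receiving side, the numbered form parameters have to be grouped back together. The following is a hypothetical sketch, not the actual CGI code: it assumes the submitted form arrives as a plain dict of strings and that f, v and e parameters sharing the same number belong to one search field. The name collect_search_fields is invented for this example.

```python
import re


def collect_search_fields(params):
    """Group f<nr>/v<nr>/e<nr> parameters into (field, value, mode) tuples."""
    # Find every f<nr> parameter and sort by the numeric part.
    numbers = sorted(
        int(m.group(1))
        for m in (re.fullmatch(r"f([1-9]\d*)", name) for name in params)
        if m
    )
    result = []
    for nr in numbers:
        value = params.get(f"v{nr}", "").strip()
        if not value:
            continue  # nothing was typed into this search box
        # A missing or empty e<nr> is treated as 0 (no exact match).
        mode = int(params.get(f"e{nr}") or 0)
        result.append((params[f"f{nr}"], value, mode))
    return result


form = {"f1": "TitleAndResponsibility", "v1": "human", "e1": "5"}
print(collect_search_fields(form))
```
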
What are bit-masks?

A bit-mask uses one byte (8 bits) as 8 separate bits, each with its own
meaning (this is a simplification, but bear with me for now).

So, 1 = 2^0, thus it's the first bit. By analogy, 2 = 2^1 and 3 = 2^0 + 2^1.
So, for 1-3 we use two bits and have:

number  bits
1       01    (just begin bit set)
2       10    (just end bit set)
3       11    (begin and end bits set)

Thus, with two bits (and values 1-3) we can express whether to exact match
from the beginning, the end, or both. For a wild-card match, we use an
additional third bit (2^2 = 4), so we have:

number  bits         exact match
1       001          begin
2       010          end
3       011          begin+end
4       100          (not used)
5       101  (4+1)   begin+wild-card
6       110  (4+2)   end+wild-card
7       111  (4+3)   begin+end+wild-card
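The table above can be reproduced by testing each bit in turn. Here is a small Python sketch (describe_mode is a name invented for this example); note that for 4 it reports the wild-card bit alone, a combination the table lists as not used.

```python
def describe_mode(mode):
    """Return (bits, meaning) for a mode value between 1 and 7."""
    bits = format(mode, "03b")  # three binary digits, zero-padded
    flags = []
    if mode & 1:  # bit for 2^0
        flags.append("begin")
    if mode & 2:  # bit for 2^1
        flags.append("end")
    if mode & 4:  # bit for 2^2
        flags.append("wild-card")
    return bits, "+".join(flags)


for number in range(1, 8):
    bits, meaning = describe_mode(number)
    print(number, bits, meaning)
```
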