The Searching Algorithm can be implemented with other languages as well. But here we will work with JavaScript to solve the Searching Algorithms. These algorithms are used to handle or work with the search related task in our projects or any applications that require searching. Searching is made easier by these algorithms. Searching is fundamental to any programming language. It would be a waste to have data structures, and no way of searching through them.The two searching algorithms we will be looking at are Linear search, and Binary search.When data items are random or not sorted, we use sequential or linear search. When data is already sorted, we use binary search. Binary search only works when your data is sorted, and we must take into account the cost of sorting the data before we even begin searching through it. When writing code, performance is everything so we need to think through before we execute them. Let’s look at each algorithm.
1. Linear Search
This is the basic kind of search. Here, given a list of data, we start from the beginning of the collection and iterate through each item until we find what we are looking for. Suppose we had a dataset with 1000 items. Using Linear search it is actually possible that the item we are looking for is the 1000th item. So the worse case scenario is that we search 1000 items in order to find what we are looking for. The runtime complexity for linear search is O(n), which is a Big O notation representing the worst runtime for a given algorithm. The “n” you see stands for number of operations for a given search. Yes its possible that the item you are looking for could be the 1st item, but the truth is we really don’t know (its not sorted). If we find the item early then that’s awesome. But there is a possibility that its located at the last spot (or closer to the end), in which case we would have to go through all items before we find what we are looking for. It is also possible that we don’t find the item at all. In which case we would have also gone through 1000 items for nothing. This way of searching is the brute-force technique, because we could potentially visit all items.
Pseudocode
- Function accepts an array and a value.
- Loop through the array and check if the current array element is equal to the value.
- If it is,return the index at which the element is found.
- If the value is never found,return -1.
O(1) best,O(n) average
var linearSearch = function(city,value) {for(var i = 0;i < city.length;i++){ if(city[i] === value) return i; } return -1; } var city = ["kathmandu","pokhara","butwal","birgunj","mahendranagar","janakpur","dharan","narangadh","bhairawa"] console.log(linearSearch(city,"butwal")) Output:2
2. Binary Search
If we have a data set that is sorted, then binary search is a better way to search.You could find any item in a collection with 1000 records much faster. This algorithm is pretty impressive, however you can only use it if the data has already been sorted. Binary search works like how humans would do search if they were looking in a dictionary for example. If i asked you to search for the CODE in a dictionary, you wouldn’t turn to the ‘A‘ section and start your search there. You would start more towards the middle of the book (say around letter M). Based on whether the alphabet you land on is high or lower, you know which direction to search (higher or lower). Alphabets B and above are higher that N, you could simply ignore all those other alphabets (narrows you collection from 26 to 13. if the collection was 1 billion, you would have narrowed it down to 500 million). This is very powerful. You can almost reduce your data list in half just because you know which direction to search. In a number guess game (where you guess a number between 1-100), if you guessed a number 60 and that final number we are looking for is higher, then you know for sure that numbers 1 to 60 can be ignored. And you only guess higher number. This is similar to how binary search works. In a worst case scenario, it would take log2 n. This is logarithmic time or O(log n). Runtime for linear search was linear time O(n).
Pseudocode
- Function accepts a sorted array and a value
- Create a left pointer at the start of the array and a right pointer at the end of the array
- While the left pointer comes before the right pointer:
- Create a pointer in the middle.
- If you find the value you want,return the index.
- If the value is too small,move the left pointer up.
- If the value is too large,move the right pointer down.
- If you never find the value,return -1.
var binarySearch = function(arr,elem){ var start = 0; var end = arr.length - 1; var middle = Math.floor(start + end / 2); while(arr[middle] !== elem && start <= end){ if(elem<arr[middle]){ end=middle-1; }else{ start = middle+1; } middle = Math.floor((start+end)/2) } // if(arr[middle] === elem){// return middle;// }// return -1; return arr[middle] === elem ? middle : -1; } console.log(binarySearch([2,4,6,9,11,14,20,25,28,40],11)) output: 4
3.String Search
This not totally related to linear search or Binary search. I just wanted to add it here as it is very much important for searching string. A very simple way to search strings is given below:
Pseudocode:
- Loop Over the longer string.
- Loop Over the shorter string.
- If the characters don’t match,break out of the inner loop.
- If the characters match,Keep going.
- If you complete the inner loop and find a match, increment the count of matches.
- Return the count
function naiveSearch(str, word){ var count = 0; for(var i = 0; i < str.length; i++){ for(var j = 0; j < word.length; j++){ if(word[j] !== str[i+j]) break; if(j === word.length - 1) count++; } } return count; } console.log(naiveSearch("search and find", "an")) output: [7]
But this way we have to search each and every word from the begining in case of multiple data to search.The Knuth-Morris-Pratt string search algorithm or KMP algorithm is an algorithm for finding a substring within another string that uses information calculated about the substring to speed up the search process.Before looking through the text to be searched, a table of jump lengths is calculated from the search string. Rather than iterating through the whole string, you can determine how many characters to skip over based on the structure of the word being searched for. As you scan through the search text and you find a mismatch, you can start comparing characters again from the last matching substring between the search term and the search text.E.g. if you are searching through “Hello world how ya doin?” for the words “old world new”, every time you find an instance of ”old world new” that isn’t followed by a line-break, you can skip the characters of checking the string and restart from the beginning of your words, because you know that they can’t occur again within that span because you have already checked those and they don’t match the begging of the search words. Basically you already know that index is the next occurence of world, so you can start again from there.
// Construct a table with table[i] as the length of the longest prefix of the substring 0..i // create a table of size equal to the length of `str` // table[i] will store the prefix of the longest prefix of the substring str[0..i] // the longest prefix of the substring str[0] has length // for the substrings the following substrings, we have two cases // case 1. the current character doesn't match the last character of the longest prefix // if that is the case, we have to backtrack, and try find a character that will be equal to the current character // if we reach 0, then we couldn't find a chracter // case 2. The last character of the longest prefix matches the current character in `str` // if that is the case, we know that the longest prefix at position i has one more character. // for example consider `.` be any character not contained in the set [a.c] // str = abc....abc // consider `i` to be the last character `c` in `str` // maxPrefix = will be 2 (the first `c` in `str`) // maxPrefix now will be 3 // so the max prefix for table[9] is 3 function makeTable(str) { var table = new Array(str.length); var maxPrefix = 0; table[0] = 0; for (var i = 1; i < str.length; i++) { while (maxPrefix > 0 && str.charAt(i) !== str.charAt(maxPrefix)) { maxPrefix = table[maxPrefix - 1]; } if (str.charAt(maxPrefix) === str.charAt(i)) { maxPrefix++; } table[i] = maxPrefix; } return table; } // Find all the words that matches in a given string `str` // this algorithm is based on the Knuth–Morris–Pratt algorithm. Its beauty consists in that it performs the matching in O(n) // find the prefix table in O(n) // `j` is the index in `word(P)` // `i` is the index in `str(S)` // Case 1. S[i] == P[j] so we move to the next index in `S` and `P` // Case 2. `j` is equal to the length of `P` // that means that we reached the end of `P` and thus we found a match // Next we have to update `j` because we want to save some time // instead of updating to j = 0 , we can jump to the last character of the longest prefix well known so far. // j-1 means the last character of `P` because j is actually `P.length` // e.g. // S = a b a b d e // P = `a b`a b // we will jump to `a b` and we will compare d and a in the next iteration // a b a b `d` e // a b `a` b // Case 3. // S[i] != P[j] There's a mismatch! // if we have found at least a character in common, do the same thing as in case 2 // otherwise, j = 0, and we can move to the next character S[i+1]function kmpMatching(str, word) { var prefixes = makeTable(word); var matches = []; var j = 0; var i = 0; while (i < str.length) { if (str.charAt(i) === word.charAt(j)) { i++; j++; } if (j === word.length) { matches.push(i-j); j = prefixes[j-1]; } else if (str.charAt(i) !== word.charAt(j)) { if (j !== 0) { j = prefixes[j-1]; } else { i++; } } } return matches; } console.log(kmpMatching("it implies that it is very important","impo")) output: [27]
We can implement these algorithms in real world applications.Hope it helps.There are many other algorithms to use. But for a start these might help.
Happy coding…..