You can use Live Search for small and medium-sized websites. No database is needed. The website is crawled starting from the defined baseurl. The links are collected and the content is cached, so future searches are faster. Search results that have already been found are stored in session files too, to increase search speed.
The indexed content is saved inside the cache directory (one text file for each URL).
The text passages containing the search strings are cropped in the search results, so only the part around the search string is displayed (similar to the result snippets on Google).
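The cropping described above could look roughly like the following. This is a minimal sketch and not the code from livesearch.class.php: the function name ls_sketch_crop and the $radius parameter are made up for illustration, and it works on bytes, so multibyte text would need the mb_* functions instead.

```php
<?php
// Hypothetical helper, not part of livesearch.class.php.
// Crops $text to a window of $radius bytes around the first
// case-insensitive match of $needle. Returns null without a match.
function ls_sketch_crop($text, $needle, $radius = 40)
{
    if ($needle === '') {
        return null; // nothing to search for
    }
    $pos = stripos($text, $needle);
    if ($pos === false) {
        return null; // search string not found
    }
    $start = max(0, $pos - $radius);
    $end = min(strlen($text), $pos + strlen($needle) + $radius);
    $snippet = substr($text, $start, $end - $start);
    // Mark cut-off ends with an ellipsis
    if ($start > 0) {
        $snippet = '... ' . $snippet;
    }
    if ($end < strlen($text)) {
        $snippet .= ' ...';
    }
    return $snippet;
}

$text = str_repeat('lorem ', 20) . 'avatar' . str_repeat(' ipsum', 20);
echo ls_sketch_crop($text, 'Avatar', 15), "\n";
```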
Various options help you configure Live Search. A Search Word Cloud is available too.
Since V2.0, keywords, descriptions and images can be searched as well.
Since V3.0 you can choose the caching method (curl or allow_url_fopen) and search within PDF files.
Since V3.3 you can add additional external hosts if they are linked on your website and you wish to include them in your search results.
Since V3.4 you can define the logical combination (AND/OR) for searching multiple words inside the config.
Since V3.7 there is a Dashboard, and some improvements were made.
LS 3.6 | ~ 21:00 Minutes |
LS 3.65 (unreleased) | ~ 14:00 Minutes |
LS 3.7 url_fget with checking for 404 | ~ 14:00 Minutes |
LS 3.7 url_fget without checking for 404 | ~ 9:30 Minutes |
LS 3.7 curl | ~ 12:00 Minutes |
just a small bugfix (allow_url_fopen and curl pre-checks)
Post any form that contains your search field and use your search results page as the form action, e.g. search.php
Upload the ls folder to your web project. The ls folder contains livesearch.class.php, livesearch.css, icon-pdf.png and cache
Include livesearch.class.php and initialize the class on every page where you want to use the Live Search functions
e.g.
<?php
include("ls/livesearch.class.php");
$LiveSearch = new LiveSearch();
?>
The base URL where the crawling should start
var $baseurl = "http://www.homac.at/";
The absolute directory path on your webserver for your $baseurl ONLY needed for the GD-thumbnail generation
var $basepath = "/users/mac/www/www.homac.at/htdocs/";
The URL to your ls directory ONLY needed to display the GD generated thumbnails
var $lsurl = "http://www.homac.at/ls/";
The name of your search results page (within the $basedir path), needed to prevent endless loops and to build the pagination
var $searchresultspage = "search.php";
Exclude paths or individual files under the $basedir from being checked for links, works with PDF files too (since V3.0)
var $excl = array("dont_index",
"hideme.html",
"private/",
"docs/invoice.pdf",
);
Include individual files under the $basedir which aren't linked anywhere on your site, for example hallo.txt isn't linked anywhere on the demo page, but its content can be found, works with PDF files too (since V3.0)
var $incl = array("hallo.txt",
"data.pdf",
);
new since V3.3 - List of external hosts/domains
array() or array("List","of","domains")
Domains/Hosts of external linked pages or embedded external images
Just the host - no URLs no Protocols
Examples
var $additionalHosts = array("www.anywhereelse.com","flickr.com");
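To illustrate why only bare host names belong in this list, here is a minimal sketch of how a link's host could be compared against it. The function name ls_sketch_host_allowed is hypothetical, and the exact-match rule (no automatic subdomain matching) is an assumption; the source does not say how LiveSearch compares hosts internally.

```php
<?php
// Hypothetical helper, not part of livesearch.class.php.
// Returns true when the host of $url equals the base host or one of
// the additional hosts (exact, case-insensitive comparison).
function ls_sketch_host_allowed($url, $baseHost, array $additionalHosts)
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return false; // relative or malformed URL: no host to compare
    }
    $allowed = array_map(
        'strtolower',
        array_merge(array($baseHost), $additionalHosts)
    );
    return in_array(strtolower($host), $allowed, true);
}

$additionalHosts = array("www.anywhereelse.com", "flickr.com");
var_dump(ls_sketch_host_allowed("http://flickr.com/photos/1", "www.homac.at", $additionalHosts));
var_dump(ls_sketch_host_allowed("http://unlisted.example/", "www.homac.at", $additionalHosts));
```

An entry such as "http://flickr.com/" would never match in this sketch, because parse_url() only returns the host part.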
new since V3.0 - Method of sitegrabbing
auto, curl or url_fopen - with auto, curl is tried first and url_fopen is used as the fallback
var $method = "auto";
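The "auto" order can be pictured with a small availability check. This is only a sketch under assumptions: ls_sketch_pick_method is a made-up name, and the real pre-checks inside the class may differ. It uses function_exists('curl_init') and the allow_url_fopen ini setting, both standard PHP.

```php
<?php
// Hypothetical helper, not part of livesearch.class.php.
// Returns "curl", "url_fopen" or null if the wanted method is unavailable.
function ls_sketch_pick_method($method = "auto")
{
    $hasCurl = function_exists('curl_init');
    $hasFopen = filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);

    // curl is tried first when "auto" or "curl" is requested
    if (($method === "auto" || $method === "curl") && $hasCurl) {
        return "curl";
    }
    // url_fopen is the fallback for "auto" or the explicit choice
    if (($method === "auto" || $method === "url_fopen") && $hasFopen) {
        return "url_fopen";
    }
    return null;
}

var_dump(ls_sketch_pick_method("auto"));
```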
Extensions for grabbing links
var $checkext = array("htm","html","php","txt");
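As an illustration of an extension filter like $checkext, the sketch below checks the path part of a URL. The helper name ls_sketch_has_checked_ext is hypothetical, and how LiveSearch treats URLs without any extension (for example directory links) is not stated in this document, so the sketch simply returns false for them.

```php
<?php
// Hypothetical helper, not part of livesearch.class.php.
// True when the URL path ends in one of the listed extensions.
function ls_sketch_has_checked_ext($url, array $checkext)
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return false; // no path component
    }
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return $ext !== '' && in_array($ext, $checkext, true);
}

$checkext = array("htm", "html", "php", "txt");
var_dump(ls_sketch_has_checked_ext("http://www.homac.at/help.php?lang=en", $checkext));
var_dump(ls_sketch_has_checked_ext("http://www.homac.at/images/logo.png", $checkext));
```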
new since V3.6 - save the URL of each cached file into a separate file next to the content file (1.txt → 1.url, 2.txt → 2.url) for debugging purposes (true/false)
var $saveURLfiles = false;
new since V3.0 - If you like to search within PDF files set this variable true (mind the requirements)
var $collect_pdfs = true;
new since V3.0 - the extensions of pdf files (usually it's just pdf :) )
var $pdfext = array("pdf");
Hours between caching processes (-1 re-caches on every search, which could be acceptable for smaller sites with dynamic content)
var $cachetime = 12; //-1 for caching every access
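The meaning of $cachetime can be sketched as a simple age check. The function ls_sketch_cache_expired and its parameters are illustrative assumptions; the class may decide this differently internally.

```php
<?php
// Hypothetical helper, not part of livesearch.class.php.
// $cachedAt and $now are Unix timestamps, $cachetimeHours as in the config.
function ls_sketch_cache_expired($cachedAt, $cachetimeHours, $now)
{
    if ($cachetimeHours < 0) {
        return true; // -1: re-cache on every access
    }
    return ($now - $cachedAt) > $cachetimeHours * 3600;
}

// Example: a file cached 13 hours ago with the default of 12 hours
$now = time();
var_dump(ls_sketch_cache_expired($now - 13 * 3600, 12, $now));
```

In a real check, $cachedAt would typically come from filemtime() of a cached file.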
Results per page, if more results are found you can use the pagination function
var $srch_res_per_page = 15;
new since V3.4 - logical combination if searching for multiple words at once
OR ... splits to words and shows results containing ANY of the words
AND ... splits to words and shows results containing ALL of the words
false ... doesn't split and shows results for the WHOLE string as it is
var $srch_logic = "OR";
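The three settings can be compared with a short sketch. It is not the class's own matching code: ls_sketch_matches is a hypothetical function that uses a plain case-insensitive substring test to show how OR, AND and false differ.

```php
<?php
// Hypothetical helper, not part of livesearch.class.php.
// $logic is "OR", "AND" or false, as in $srch_logic.
function ls_sketch_matches($content, $query, $logic)
{
    $query = trim($query);
    if ($query === '') {
        return false;
    }
    if ($logic === false) {
        // No splitting: the whole string must occur as it is
        return stripos($content, $query) !== false;
    }
    $words = preg_split('/\s+/', $query, -1, PREG_SPLIT_NO_EMPTY);
    $hits = 0;
    foreach ($words as $word) {
        if (stripos($content, $word) !== false) {
            $hits++;
        }
    }
    // AND needs every word, OR needs at least one
    return $logic === "AND" ? $hits === count($words) : $hits > 0;
}

$content = "red apple pie";
var_dump(ls_sketch_matches($content, "red banana", "OR"));
var_dump(ls_sketch_matches($content, "red banana", "AND"));
var_dump(ls_sketch_matches($content, "red banana", false));
```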
new since V3.0 (bootstrap since 3.2, bootstrap3 since 3.3) - the style of the pagination
currently there are four styles available (default, boxed, bootstrap, bootstrap3) - the style is only used in the draw... methods
bootstrap uses Bootstrap 2.3.2 CSS styles
bootstrap3 uses Bootstrap 3.1.1 CSS styles
var $pagerstyle = "boxed";
Min and max fontsize for the SearchCloud (px)
var $cloud_min = 10;
var $cloud_max = 45;
var $maxCloudItems = 50;
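One way to picture how these values interact is a linear scaling of the font size between $cloud_min and $cloud_max, limited to the most frequent terms. This is purely a sketch: the scaling formula, the reading of $maxCloudItems as "maximum number of terms shown", the sample counts and the function name ls_sketch_cloud_size are all assumptions, not taken from the class.

```php
<?php
// Hypothetical helper, not part of livesearch.class.php.
// Maps a search count linearly onto the range cloud_min..cloud_max (px).
function ls_sketch_cloud_size($count, $minCount, $maxCount, $cloud_min = 10, $cloud_max = 45)
{
    if ($maxCount <= $minCount) {
        return $cloud_min; // all terms equally frequent, avoid division by zero
    }
    $ratio = ($count - $minCount) / ($maxCount - $minCount);
    return (int) round($cloud_min + $ratio * ($cloud_max - $cloud_min));
}

// Made-up counts per search term
$counts = array("avatar" => 8, "easter" => 3, "firefox" => 1);
arsort($counts);                               // most frequent first
$top = array_slice($counts, 0, 50, true);      // keep at most 50 items
foreach ($top as $term => $count) {
    $px = ls_sketch_cloud_size($count, min($top), max($top));
    echo '<span style="font-size:' . $px . 'px">' . htmlspecialchars($term) . "</span>\n";
}
```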
new since V3.2 - Error messages for a query string that is too short or for no results (only if you use the drawSearchResults method)
var $errorToShort = '<div class="alert alert-error">You have to enter at least %1$s characters.</div>';
var $errorNothingFound = '<div class="alert alert-info">No search results for %1$s.</div>';
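The %1$s placeholder in both messages is standard sprintf() syntax for "first argument as string". The sketch below shows how such templates can be filled. The minimum length of 3 is a made-up value, and escaping the search string with htmlspecialchars() is a suggestion of this sketch, not something the source says the class does.

```php
<?php
// Templates copied from the configuration above
$errorToShort = '<div class="alert alert-error">You have to enter at least %1$s characters.</div>';
$errorNothingFound = '<div class="alert alert-info">No search results for %1$s.</div>';

$minLength = 3;          // hypothetical minimum query length
$q = '<b>xy</b>';        // example user input containing markup

// %1$s is replaced by the first argument after the template
$tooShortMsg = sprintf($errorToShort, $minLength);
// Escape user input before it is placed into HTML
$nothingFoundMsg = sprintf($errorNothingFound, htmlspecialchars($q, ENT_QUOTES, 'UTF-8'));

echo $tooShortMsg, "\n", $nothingFoundMsg, "\n";
```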
If you run into performance trouble on larger websites (timeouts during caching, memory exhaustion ...) you should set this value to true, otherwise leave it false
var $performance_fix = false;
If you like to search for images too (filename, alt-tag, title-tag) set this variable true
var $collect_images = true;
The headline for image results when several images are found
var $img_results_headline = '%1$s Images for %2$s'; //Number of images, Searchstring
new since V3.0 - The headline for image results when exactly one image is found
var $img_result_headline = '1 Image for %1$s'; //Searchstring
The height of GD generated images - !!! livesearch.css contains height definitions too, for the CSS thumbs !!!
var $thumb_height = 70;
If true, the thumbnails will be generated automatically with the help of the GD library, otherwise the images will be sized by CSS
var $create_thumbs = true;
UTF decoding for searching - if needed (true/false) - enabled by default
var $utf8DecodeResults = true;
Cache directory - must be writable; in the example below the path to the cache directory is calculated automatically relative to livesearch.class.php
$this->cachedir = realpath(dirname(__FILE__)) . "/cache";
Caches the files without searching - this action can take a while and is called automatically during the search process if no files are cached or the cached files are older than the defined $cachetime
$LiveSearch->cacheFiles();
Necessary to initiate the search; if no files are cached or the cached files are older than the defined $cachetime, the cacheFiles function is called by the search function too
$LiveSearch->search($_REQUEST["q"],$_REQUEST["p"]);
or, if you like to design the results by yourself (you will get an array)
$searchresults = $LiveSearch->search($_REQUEST["q"],$_REQUEST["p"]);
Array
(
[0] => Array
(
[0] => Array
(
[src] => http://ls.envato.homac.at/images/gravatar.jpg
[title] =>
[alt] => image
[parenturl] => http://ls.envato.homac.at/index.php
[GDThumb] => d9b023be3750db3cfbdcc72f0e71cc65.jpg
)
[title] => 1 Images for avatar
[url] => #
[content] => <a href="http://ls.envato.homac.at/images/gravatar.jpg"><img src="http://ls.envato.homac.at/ls/cache/thumbs/d9b023be3750db3cfbdcc72f0e71cc65.jpg" alt="image" title=""></a>
)
[1] => Array
(
[url] => http://ls.envato.homac.at/help.php
[title] => LiveSearch - How it works
[content] => ... Links (i.e. &action=search) - don't forget the leading & Current cloud for this website: <strong class="highlight">avatar</strong> search easter image super firefox keywords duper <strong class="highlight">avatar</strong>search wise Excluding blocks from being indexed since V 1.3 you're able to exclude/hide blocks...
)
)
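If you build the output yourself, a loop over this structure could look like the sketch below. The sample data is shortened from the output above, ls_sketch_render is a hypothetical function, and the sketch assumes that the content field already contains ready-made HTML (highlight markup or image links), as the output above suggests.

```php
<?php
// Hypothetical renderer, not part of livesearch.class.php.
function ls_sketch_render(array $results)
{
    $html = '';
    foreach ($results as $hit) {
        $url = htmlspecialchars($hit['url'], ENT_QUOTES, 'UTF-8');
        $title = htmlspecialchars($hit['title'], ENT_QUOTES, 'UTF-8');
        $html .= '<h3><a href="' . $url . '">' . $title . "</a></h3>\n";
        // content is inserted unescaped because it already contains markup
        $html .= '<p>' . $hit['content'] . "</p>\n";
    }
    return $html;
}

// Shortened sample mirroring the structure shown above
$searchresults = array(
    array(
        0 => array("src" => "http://ls.envato.homac.at/images/gravatar.jpg", "alt" => "image"),
        "title" => "1 Images for avatar",
        "url" => "#",
        "content" => '<a href="http://ls.envato.homac.at/images/gravatar.jpg">...</a>',
    ),
    array(
        "url" => "http://ls.envato.homac.at/help.php",
        "title" => "LiveSearch - How it works",
        "content" => '... <strong class="highlight">avatar</strong> search ...',
    ),
);

echo ls_sketch_render($searchresults);
```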
After a successful search you have access to some variables
Function to delete all cached files. Note: cached files will be deleted automatically if they are too old ($cachetime exceeded) or on every other caching process (-1)
$LiveSearch->clearCache();
Function to remove stored search results
$LiveSearch->clearSrch();
Function to remove stored searchstrings (used by the Search Word Cloud)
$LiveSearch->clearSrchStr();
Returns paging information after a successful search with more results than the defined $srch_res_per_page; with this information you can build your own pagination
$LiveSearch->pager();
Returns array with the following keys:
Returns an example output for the pagination if results are more than the defined $srch_res_per_page and will be called in the $LiveSearch->drawSearchresults() method too.
$LiveSearch->drawPagination();
or
$LiveSearch->drawPagination("p","q")
or
$LiveSearch->drawPagination("p","q","&action=search")
or, new since V3.0
$LiveSearch->drawPagination("p","q","&action=search","boxed")
Syntax
$LiveSearch->drawPagination([PageVarName], [SearchStringVarName], [Add2Query], [PagerStyle])
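To show what the three query-related parameters stand for, here is a sketch that builds page links by hand. Everything in it is an assumption made for illustration: the function ls_sketch_page_links, the 1-based page numbers and the defaults "p" and "q" are not confirmed by the source, and the markup produced by drawPagination itself is not reproduced here.

```php
<?php
// Hypothetical helper, not part of livesearch.class.php.
// Builds one query string per result page.
function ls_sketch_page_links($total, $perPage, $q, $pageVar = "p", $queryVar = "q", $add2query = "")
{
    $pages = (int) ceil($total / $perPage);
    $links = array();
    for ($p = 1; $p <= $pages; $p++) {
        // $add2query must start with "&", e.g. "&action=search"
        $links[] = '?' . $queryVar . '=' . urlencode($q) . '&' . $pageVar . '=' . $p . $add2query;
    }
    return $links;
}

// 31 hits with 15 results per page give 3 pages
print_r(ls_sketch_page_links(31, 15, "avatar", "p", "q", "&action=search"));
```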
An example output for the search results, including the pagination from above
$LiveSearch->drawSearchresults();
or
$LiveSearch->drawSearchresults("p","q")
$LiveSearch->drawSearchresults("p","q","&action=search")
Syntax
$LiveSearch->drawSearchresults([PageVarName], [SearchStringVarName], [Add2Query])
Shows you the collected and cached URLs
$LiveSearch->showUrls();
Shows you the Search Word Cloud
$LiveSearch->printSrchCloud()
or
$LiveSearch->printSrchCloud("q")
or, new since V1.1
$LiveSearch->printSrchCloud("q","&action=search")
Syntax
$LiveSearch->printSrchCloud([SearchStringVarName], [Add2Query])
Since V1.3 you're able to exclude/hide blocks of your website from LiveSearch by setting simple comment tags. This makes sense for menus that appear on every page
Start hiding
<!--LSHIDE-->
Stop hiding
<!--/LSHIDE-->
Examples
On this example page the Ciao Codecanyon part on the index page can't be found. (ciao too :) )
Other examples
#1
Some words, can be found but <!--LSHIDE-->this combination can't be<!--/LSHIDE--> found
#2
blabla
<!--LSHIDE-->
Mainmenue #1
Mainmenue #2
Mainmenue #3
<!--/LSHIDE-->
some text ...
<!--LSHIDE-->
Submenue #1
Submenue #2
<!--/LSHIDE-->
...
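The effect of the comment tags can be imitated with a regular expression that drops everything between a start and a stop tag. This is a sketch only: ls_sketch_strip_hidden is a hypothetical function, and whether the class matches the tags case-sensitively or handles unclosed blocks is not stated here.

```php
<?php
// Hypothetical helper, not part of livesearch.class.php.
// Removes every <!--LSHIDE--> ... <!--/LSHIDE--> block, also across lines.
function ls_sketch_strip_hidden($html)
{
    // "s" lets "." match newlines, ".*?" stops at the nearest closing tag
    return preg_replace('~<!--LSHIDE-->.*?<!--/LSHIDE-->~s', '', $html);
}

$page = "Some words, can be found but <!--LSHIDE-->this combination can't be<!--/LSHIDE--> found";
echo ls_sketch_strip_hidden($page), "\n";
```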
In addition to the LSHIDE blocks, since V2.0 you can hide images from being indexed by adding a class called LSHIDE to your images
Here are some usage samples
<img src='images/icons/contact.gif' alt='contact' class='icon LSHIDE' /> <!--won't be indexed-->
<img src='images/space.png' alt='' class='lshide' /> <!--won't be indexed-->
<img src="images/portfolio/homac.jpg" alt="homac" title="Homac e.U." class="float-left p5" /> <!--will be indexed-->
<img src="images/portfolio/envazo.jpg" alt="" /> <!--will be indexed-->
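The samples above suggest that the class name is matched regardless of case and may stand next to other classes. A check along those lines is sketched below. The function ls_sketch_img_hidden is hypothetical and uses a simple regular expression on a single img tag, which is enough for illustration but not a full HTML parser.

```php
<?php
// Hypothetical helper, not part of livesearch.class.php.
// True when the class attribute of an <img> tag contains "LSHIDE"
// (any case) as one of its class names.
function ls_sketch_img_hidden($imgTag)
{
    // Capture the class attribute value in single or double quotes
    if (!preg_match('/\bclass\s*=\s*(["\'])(.*?)\1/i', $imgTag, $m)) {
        return false; // no class attribute at all
    }
    $classes = preg_split('/\s+/', strtolower(trim($m[2])), -1, PREG_SPLIT_NO_EMPTY);
    return in_array('lshide', $classes, true);
}

var_dump(ls_sketch_img_hidden("<img src='images/icons/contact.gif' alt='contact' class='icon LSHIDE' />"));
var_dump(ls_sketch_img_hidden('<img src="images/portfolio/homac.jpg" alt="homac" class="float-left p5" />'));
```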
With version 3.7 the item was equipped with a small administration interface which shows you some status information and allows you to use some of the methods directly inside this interface.
To access the LiveSearch Manager, point your browser to lsMngr.php in your ls directory on the webserver, for example: http://www.mysite.com/ls/lsMngr.php
The user credentials have to be set directly in the class file
var $mngrUser = "admin";
var $mngrPass = ""; //please choose your password
<form method="post" action="search.php">
<input type="text" name="q">
<input type="submit">
</form>
<?php
include("ls/livesearch.class.php");
$LiveSearch = new LiveSearch();
?>
...
<?php
$LiveSearch->search($_REQUEST["q"],$_REQUEST["p"]);
echo "<p>" . $LiveSearch->drawSearchresults() . "</p>";
?>
...
<?php
include("ls/livesearch.class.php");
$LiveSearch = new LiveSearch();
?>
...
<?php
$search_results = $LiveSearch->search($_REQUEST["q"],$_REQUEST["p"]);
echo "Found: " . $LiveSearch->searchcount;
echo "Pages: " . $LiveSearch->pages;
echo "Current Page: " . $LiveSearch->p;
echo "<pre><b>Search Results</b><code>" .
print_r($search_results,true) . "</code></pre>";
?>
...
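The examples above pass $_REQUEST["q"] and $_REQUEST["p"] straight to search(). On the first request "p" is usually not set, which raises a PHP notice or warning. A small normalization step, sketched below, avoids that. The function ls_sketch_read_request is hypothetical, and the default page value is left to the caller because this document does not state which page number search() expects for the first page.

```php
<?php
// Hypothetical helper, not part of livesearch.class.php.
// Returns array($q, $p) with safe defaults for missing or odd input.
function ls_sketch_read_request(array $request, $defaultPage)
{
    $q = (isset($request["q"]) && is_string($request["q"]))
        ? trim($request["q"])
        : "";
    $p = (isset($request["p"]) && is_numeric($request["p"]))
        ? (int) $request["p"]
        : $defaultPage;
    return array($q, $p);
}

// Example with a simulated request instead of $_REQUEST
list($q, $p) = ls_sketch_read_request(array("q" => "  avatar "), 1);
var_dump($q, $p);
```

The two values could then be handed to $LiveSearch->search($q, $p) in place of the raw request variables.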