A genetic programming framework to schedule webpage updates
The quality of a Web search engine is influenced by several factors, including coverage and the freshness of the content gathered by the web crawler. Focusing particularly on freshness, one key challenge is to estimate the likelihood of a previously crawled webpage being modified. Such estimates are...
Published in: | Information retrieval (Boston) Vol. 18; no. 1; pp. 73 - 94 |
---|---|
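Main Authors: | Santos, Aécio S. R.; de Carvalho, Cristiano R.; Almeida, Jussara M.; de Moura, Edleno S.; da Silva, Altigran S.; Ziviani, Nivio |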
Format: | Journal Article |
Language: | English |
Published: | Dordrecht: Springer Netherlands (Springer Nature B.V), 2015-02-01 |
Subjects: | Scheduling functions; Genetic Programming; Web crawling |
Abstract | The quality of a Web search engine is influenced by several factors, including coverage and the freshness of the content gathered by the web crawler. Focusing particularly on freshness, one key challenge is to estimate the likelihood of a previously crawled webpage being modified. Such estimates are used to define the order in which those pages should be visited, and thus can be exploited to reduce the cost of monitoring crawled webpages for keeping updated versions. We here present a Genetic Programming framework, called GP4C (Genetic Programming for Crawling), to generate score functions that produce accurate rankings of pages regarding their probabilities of having been modified. We compare GP4C with state-of-the-art methods using a large dataset of webpages crawled from the Brazilian Web. Our evaluation includes multiple performance metrics and several variations of our framework, built from exploring different sets of terminals and fitness functions. In particular, we evaluate GP4C using the ChangeRate and Normalized Discounted Cumulative Gain (NDCG) metrics as both objective function and evaluation metric. We show that, in comparison with ChangeRate, NDCG is better able to evaluate the effectiveness of scheduling strategies, since it takes the ranking produced by the scheduling into account. |
---|---|
Author | Almeida, Jussara M.; de Moura, Edleno S.; da Silva, Altigran S.; Santos, Aécio S. R.; de Carvalho, Cristiano R.; Ziviani, Nivio |
Author_xml | – sequence: 1 givenname: Aécio S. R. surname: Santos fullname: Santos, Aécio S. R. organization: Department of Computer Science, Federal University of Minas Gerais, Zunnit Technologies – sequence: 2 givenname: Cristiano R. surname: de Carvalho fullname: de Carvalho, Cristiano R. organization: Department of Computer Science, Federal University of Minas Gerais – sequence: 3 givenname: Jussara M. surname: Almeida fullname: Almeida, Jussara M. email: jussara@dcc.ufmg.br organization: Department of Computer Science, Federal University of Minas Gerais – sequence: 4 givenname: Edleno S. surname: de Moura fullname: de Moura, Edleno S. organization: Institute of Computing, Federal University of Amazonas – sequence: 5 givenname: Altigran S. surname: da Silva fullname: da Silva, Altigran S. organization: Institute of Computing, Federal University of Amazonas – sequence: 6 givenname: Nivio surname: Ziviani fullname: Ziviani, Nivio organization: Department of Computer Science, Federal University of Minas Gerais, Zunnit Technologies |
Cites_doi | 10.1145/582415.582418 10.1109/TKDE.2004.1269663 10.1145/857166.857170 10.1002/asi.22665 10.1002/asi.20009 10.1561/1500000017 10.1007/s10791-005-6991-7 10.1145/1852102.1852103 10.1016/j.is.2008.07.003 10.1007/978-3-319-02432-5_30 10.1016/B978-155860869-6/50052-4 10.1145/1277741.1277810 10.1145/335191.335391 10.1145/2433396.2433448 10.1145/1571941.1572041 10.1002/(SICI)1099-1425(199806)1:1<15::AID-JOS3>3.0.CO;2-K 10.1007/978-3-642-24583-1_23 |
ContentType | Journal Article |
Copyright | Springer Science+Business Media New York 2014 |
DOI | 10.1007/s10791-014-9248-5 |
Discipline | Engineering; Library & Information Science; Computer Science |
EISSN | 1573-7659 |
EndPage | 94 |
ISSN | 1386-4564 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 1 |
Keywords | Scheduling functions; Genetic Programming; Web crawling |
Language | English |
PageCount | 22 |
PublicationCentury | 2000 |
PublicationDate | 2015-02-01 |
PublicationDateYYYYMMDD | 2015-02-01 |
PublicationDate_xml | – month: 02 year: 2015 text: 2015-02-01 day: 01 |
PublicationDecade | 2010 |
PublicationPlace | Dordrecht |
PublicationPlace_xml | – name: Dordrecht |
PublicationTitle | Information retrieval (Boston) |
PublicationTitleAbbrev | Inf Retrieval J |
PublicationYear | 2015 |
Publisher | Springer Netherlands Springer Nature B.V |
Publisher_xml | – name: Springer Netherlands – name: Springer Nature B.V |
StartPage | 73 |
SubjectTerms | Computer Science; Data Mining and Knowledge Discovery; Data Structures and Information Theory; Freshness; Genetic algorithms; Image quality; Information retrieval; Information Storage and Retrieval; Internet; Machine learning; Mathematical programming; Natural Language Processing (NLP); Pattern Recognition; Performance evaluation; Performance measurement; Programming; Scheduling; Search engines; Studies |
Title | A genetic programming framework to schedule webpage updates |
URI | https://link.springer.com/article/10.1007/s10791-014-9248-5 https://www.proquest.com/docview/1856370195 |
Volume | 18 |
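The abstract's claim that NDCG evaluates scheduling strategies better than ChangeRate, because it takes the ranking into account, can be illustrated with a small sketch. The definitions below are assumptions made for illustration and are not taken from the article: ChangeRate is treated as the fraction of the first k scheduled pages that had actually changed, and NDCG uses binary gains (1 = changed, 0 = unchanged) with the commonly used log2(rank + 1) discount. The two schedules are hypothetical data.

```python
import math


def change_rate(changed_flags, k):
    """Fraction of the first k scheduled pages that had actually changed.

    Assumed definition for illustration; it ignores the order inside the top k.
    """
    top = changed_flags[:k]
    return sum(top) / len(top) if top else 0.0


def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount (1-based rank)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg(changed_flags, k):
    """DCG of the first k positions divided by the DCG of the ideal ordering."""
    top = changed_flags[:k]
    ideal = sorted(changed_flags, reverse=True)[:k]
    best = dcg(ideal)
    return dcg(top) / best if best > 0 else 0.0


# Hypothetical schedules over the same six pages (1 = changed, 0 = unchanged).
schedule_a = [1, 1, 0, 0, 1, 0]  # changed pages visited first
schedule_b = [0, 0, 1, 1, 1, 0]  # changed pages visited later

k = 4
print("ChangeRate:", change_rate(schedule_a, k), change_rate(schedule_b, k))
print("NDCG:", ndcg(schedule_a, k), ndcg(schedule_b, k))
```

Under these assumptions both schedules visit two changed pages among the first four, so ChangeRate cannot tell them apart, while NDCG scores the schedule that places the changed pages earlier more highly.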