<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-697386284077671949</id><updated>2012-02-05T19:08:29.725-08:00</updated><category term='Ressources'/><category term='URL'/><category term='Outils'/><category term='Divers'/><title type='text'>自然</title><subtitle type='html'>This blog serves as a logbook documenting the progress of our (supervised) project on the term 自然.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://shizen-ziran.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://shizen-ziran.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Pierre</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>15</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-697386284077671949.post-2398340043147444032</id><published>2008-01-10T17:32:00.000-08:00</published><updated>2008-02-14T03:03:25.764-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Divers'/><title type='text'>End of the project</title><content type='html'>That's it. 
The project is finished.&lt;br /&gt;The final report is available at the following addresses:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://pierre.inalco.free.fr/shizen/index.html"&gt;http://pierre.inalco.free.fr/shizen/index.html&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://tal.univ-paris3.fr/plurital/travaux-2007-2008/L7T04-2007-2008/Pierre-Marchal/index.html"&gt;http://tal.univ-paris3.fr/plurital/travaux-2007-2008/...&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;See you soon for new adventures!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/697386284077671949-2398340043147444032?l=shizen-ziran.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://shizen-ziran.blogspot.com/feeds/2398340043147444032/comments/default' title='Publier les commentaires'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=697386284077671949&amp;postID=2398340043147444032' title='1 commentaires'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/2398340043147444032'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/2398340043147444032'/><link rel='alternate' type='text/html' href='http://shizen-ziran.blogspot.com/2008/01/fin-du-projet.html' title='Fin du projet'/><author><name>Pierre</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-697386284077671949.post-2647824266028496844</id><published>2007-12-17T09:18:00.000-08:00</published><updated>2008-12-10T06:05:08.731-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Outils'/><title type='text'>script.sh: improvement</title><content 
type='html'>Starting from the names of the URL files, the script creates several directories and stores the downloaded pages, the dumps, and the contexts in them.&lt;br /&gt;The catch: these file names are used as-is, extensions included, so we end up with directories named "something.txt" in our directory tree.  &lt;br /&gt;To fix this, we will use the &lt;a href="http://en.wikipedia.org/wiki/Basename"&gt;basename&lt;/a&gt; command with the following syntax:&lt;br /&gt;&lt;br /&gt;&lt;blockquote style="color:green;"&gt;&lt;span title="new variable"&gt;fic2&lt;/span&gt;=$(basename &lt;span title="file name"&gt;"$fic"&lt;/span&gt; &lt;span title="trailing string to remove"&gt;.txt&lt;/span&gt;)&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;We will then use the string stored in the variable $fic2 to name our directories:&lt;br /&gt;&lt;br /&gt;&lt;blockquote style="color:green;"&gt;mkdir "./pg_aspirees/$fic2"&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;&lt;a title="before" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_vjbmW2-heNI/R2fNp6Sw-8I/AAAAAAAAADE/hx-m21OOGME/s1600-h/02.png"&gt;&lt;img style="float:center; margin:0 10px 10px 0;cursor:pointer; cursor:hand;" src="http://1.bp.blogspot.com/_vjbmW2-heNI/R2fNp6Sw-8I/AAAAAAAAADE/hx-m21OOGME/s200/02.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5145307219200375746" /&gt;&lt;/a&gt;&lt;a title="after" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_vjbmW2-heNI/R2fOLqSw-9I/AAAAAAAAADM/93EbNm5UYrc/s1600-h/03.png"&gt;&lt;img style="float:center; margin:0 0 10px 10px;cursor:pointer; cursor:hand;" src="http://4.bp.blogspot.com/_vjbmW2-heNI/R2fOLqSw-9I/AAAAAAAAADM/93EbNm5UYrc/s200/03.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5145307799020960722" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' 
height='1' src='https://blogger.googleusercontent.com/tracker/697386284077671949-2647824266028496844?l=shizen-ziran.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://shizen-ziran.blogspot.com/feeds/2647824266028496844/comments/default' title='Publier les commentaires'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=697386284077671949&amp;postID=2647824266028496844' title='0 commentaires'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/2647824266028496844'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/2647824266028496844'/><link rel='alternate' type='text/html' href='http://shizen-ziran.blogspot.com/2007/12/scriptsh-amlioration.html' title='script.sh: amélioration'/><author><name>Pierre</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_vjbmW2-heNI/R2fNp6Sw-8I/AAAAAAAAADE/hx-m21OOGME/s72-c/02.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-697386284077671949.post-7215268436827830176</id><published>2007-11-30T06:32:00.000-08:00</published><updated>2008-12-10T06:05:10.332-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Outils'/><title type='text'>minigrepmultilingue 1.0</title><content type='html'>The last step of our project (before the final formatting) is to extract a pattern and its context from our dump files. 
For a text in French or English, the egrep command would have been enough, but for Japanese and Chinese we needed an equivalent that supports Unicode.&lt;br /&gt;We therefore turned to minigrepmultilingue 1.0, which is a perfect fit for what we need to do. &lt;br /&gt;&lt;br /&gt; &lt;br /&gt;&lt;div style="text-align:center;"&gt;&lt;strong&gt;i. preparation&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;Download &lt;a href="http://www.cavi.univ-paris3.fr/ilpga/ilpga/tal/cours/minigrepmultilingue.zip"&gt;minigrepmultilingue.zip&lt;/a&gt;  (the archive containing the script and the Unicode-String-2.09 module) and unpack it into a directory that we will call "minigrepmultilingue".&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_vjbmW2-heNI/R1AqvIyOG3I/AAAAAAAAAA8/CKY-bHwoK48/s1600-R/archive.JPG"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://1.bp.blogspot.com/_vjbmW2-heNI/R1AqvIyOG3I/AAAAAAAAAA8/PPal_86j6l8/s200/archive.JPG" border="0" alt="" id="BLOGGER_PHOTO_ID_5138654164130077554" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Start the Cygwin package manager (setup) and check that &lt;a href="http://fr.wikipedia.org/wiki/Make"&gt;make&lt;/a&gt; and &lt;a href="http://gcc.gnu.org/" title="GNU Compiler Collection"&gt;gcc&lt;/a&gt; are installed (they are listed in the &lt;span style="font-style:italic;"&gt;Devel&lt;/span&gt; category).&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_vjbmW2-heNI/R1BSH4yOHCI/AAAAAAAAACU/jjasSmFTMxo/s1600-R/cyg-setup.JPG"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://4.bp.blogspot.com/_vjbmW2-heNI/R1BSH4yOHCI/AAAAAAAAACU/AkuaTN69GsE/s200/cyg-setup.JPG" border="0" 
alt="" id="BLOGGER_PHOTO_ID_5138697470285323298" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;In Cygwin, change to the "minigrepmultilingue" directory and unpack the Unicode-String-2.09 module:&lt;br /&gt;&lt;br /&gt;&lt;span style="color: green;"&gt;tar xzf Unicode-String-2.09.tar.gz&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_vjbmW2-heNI/R1A-xoyOG-I/AAAAAAAAAB0/9a6njm8z4RE/s1600-R/tar.JPG"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://3.bp.blogspot.com/_vjbmW2-heNI/R1A-xoyOG-I/AAAAAAAAAB0/CldQbFDQnRU/s400/tar.JPG" border="0" alt="" id="BLOGGER_PHOTO_ID_5138676197312306146" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align:center;"&gt;&lt;strong&gt;ii. compilation&lt;/strong&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Go to the Unicode-String-2.09 module directory and run the Makefile.PL script (&lt;span style="color:green;"&gt;perl Makefile.PL&lt;/span&gt;):&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_vjbmW2-heNI/R1A72YyOG7I/AAAAAAAAABc/0XU4CpcGFFU/s1600-R/perl.JPG"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://2.bp.blogspot.com/_vjbmW2-heNI/R1A72YyOG7I/AAAAAAAAABc/BOdaB-bsWqQ/s400/perl.JPG" border="0" alt="" id="BLOGGER_PHOTO_ID_5138672980381801394" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Run the command &lt;span style="color:green;"&gt;make&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_vjbmW2-heNI/R1A9PIyOG8I/AAAAAAAAABk/eQ3r9P4G_kI/s1600-R/make.JPG"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://1.bp.blogspot.com/_vjbmW2-heNI/R1A9PIyOG8I/AAAAAAAAABk/UNFBzCJ_-cE/s400/make.JPG" border="0" 
alt="" id="BLOGGER_PHOTO_ID_5138674505095191490" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;and finally &lt;span style="color:green;"&gt;make test&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_vjbmW2-heNI/R1A-UoyOG9I/AAAAAAAAABs/b2Fl_8YRR0w/s1600-R/test.JPG"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://3.bp.blogspot.com/_vjbmW2-heNI/R1A-UoyOG9I/AAAAAAAAABs/oF7b-R4MA-g/s400/test.JPG" border="0" alt="" id="BLOGGER_PHOTO_ID_5138675699096099794" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align:center;"&gt;&lt;strong&gt;iii. installation&lt;/strong&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Installation is nothing complicated; just type: &lt;span style="color:green;"&gt;make install&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_vjbmW2-heNI/R1BA5IyOG_I/AAAAAAAAAB8/bo0aVp-XIi0/s1600-R/install.JPG"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://1.bp.blogspot.com/_vjbmW2-heNI/R1BA5IyOG_I/AAAAAAAAAB8/y38pyqQ51hs/s400/install.JPG" border="0" alt="" id="BLOGGER_PHOTO_ID_5138678525184580594" /&gt;&lt;/a&gt; &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align:center;"&gt;&lt;strong&gt;iv. test&lt;/strong&gt;&lt;/div&gt;&lt;br /&gt;To make sure everything went well, nothing beats a quick test. 
Following the example provided with the script, we will search for the pattern &lt;span style="color:red;"&gt;основных&lt;/span&gt; in the file RU_Convention_UTF8.txt:&lt;br /&gt;&lt;br /&gt;&lt;span style="color:green;"&gt;perl mini-grep-multilingue.pl "UTF-8" RU_Convention_UTF8.txt motif.txt&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The output is an HTML file: it works!&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_vjbmW2-heNI/R1BKsIyOHBI/AAAAAAAAACM/LfE6IVlUo10/s1600-R/resultat.JPG"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://1.bp.blogspot.com/_vjbmW2-heNI/R1BKsIyOHBI/AAAAAAAAACM/VetwnVu63ao/s200/resultat.JPG" border="0" alt="" id="BLOGGER_PHOTO_ID_5138689296962558994" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;For more information: &lt;a href="http://www.cavi.univ-paris3.fr/ilpga/ilpga/tal/cours/minigrepmultilingue.htm"&gt;http://www.cavi.univ-paris3.fr/ilpga/ilpga/...&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/697386284077671949-7215268436827830176?l=shizen-ziran.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://shizen-ziran.blogspot.com/feeds/7215268436827830176/comments/default' title='Publier les commentaires'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=697386284077671949&amp;postID=7215268436827830176' title='5 commentaires'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/7215268436827830176'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/7215268436827830176'/><link rel='alternate' type='text/html' href='http://shizen-ziran.blogspot.com/2007/11/installer-minigrepmultilingue-10.html' 
title='minigrepmultilingue 1.0'/><author><name>Pierre</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_vjbmW2-heNI/R1AqvIyOG3I/AAAAAAAAAA8/PPal_86j6l8/s72-c/archive.JPG' height='72' width='72'/><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-697386284077671949.post-3585813674861516597</id><published>2007-11-30T05:35:00.000-08:00</published><updated>2008-12-10T06:05:10.740-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Outils'/><title type='text'>lynx -dump</title><content type='html'>Once our pages have been downloaded with wget, we need to extract their text.&lt;br /&gt;To do so, we will use the -dump option of &lt;a href="http://lynx.isc.org/"&gt;lynx&lt;/a&gt;.&lt;br /&gt;The syntax is as follows:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;lynx -dump [URL] &gt; [dump].txt&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://w3m.sourceforge.net/index.en.html" title="world wide web wo miru"&gt;w3m&lt;/a&gt;, another text-mode web browser, can perform the same operation. But beware! 
Because w3m handles &lt;a href="http://en.wikipedia.org/wiki/Framing_%28World_Wide_Web%29"&gt;frames&lt;/a&gt;, the output dump file is sometimes unintelligible (fragments of text from different parts of the page can end up mixed together).&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_vjbmW2-heNI/R1iGxOAxgJI/AAAAAAAAACs/v94Ei-qoBoQ/s1600-h/lynx.png" title="lynx"&gt;&lt;img style="float:center; margin:0 10px 10px 0;cursor:pointer; cursor:hand;" src="http://4.bp.blogspot.com/_vjbmW2-heNI/R1iGxOAxgJI/AAAAAAAAACs/v94Ei-qoBoQ/s200/lynx.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5141007154776473746" /&gt;&lt;/a&gt; &lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_vjbmW2-heNI/R1iHLOAxgKI/AAAAAAAAAC0/XyCo7gpsHoQ/s1600-h/w3m.png" title="w3m"&gt;&lt;img style="float:center; margin:0 10px 10px 0;cursor:pointer; cursor:hand;" src="http://4.bp.blogspot.com/_vjbmW2-heNI/R1iHLOAxgKI/AAAAAAAAAC0/XyCo7gpsHoQ/s200/w3m.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5141007601453072546" /&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/697386284077671949-3585813674861516597?l=shizen-ziran.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://shizen-ziran.blogspot.com/feeds/3585813674861516597/comments/default' title='Publier les commentaires'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=697386284077671949&amp;postID=3585813674861516597' title='0 commentaires'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/3585813674861516597'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/3585813674861516597'/><link rel='alternate' type='text/html' 
href='http://shizen-ziran.blogspot.com/2007/11/lynx-dump.html' title='lynx -dump'/><author><name>Pierre</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_vjbmW2-heNI/R1iGxOAxgJI/AAAAAAAAACs/v94Ei-qoBoQ/s72-c/lynx.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-697386284077671949.post-6064386890101507112</id><published>2007-11-16T02:02:00.000-08:00</published><updated>2008-12-10T06:05:11.545-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Outils'/><title type='text'>%0D?</title><content type='html'>Putting wget in a &lt;acronym title="Bourne-Again Shell"&gt;Bash&lt;/acronym&gt; script was meant to automate the task of downloading my pages. I wrote a script that goes through each line (that is, each address) of the &lt;acronym title="Uniform Resource Locator"&gt;URL&lt;/acronym&gt; file, fetches each page with wget and, in a newly created table, gives one link to the &lt;span style="font-style: italic;"&gt;online&lt;/span&gt; page and another to the downloaded page.&lt;br /&gt;Everything was for the best in the best of all possible worlds until I realized that script.sh (that is its pet name) was not doing what I had asked of it:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_vjbmW2-heNI/Rz1waIyOGxI/AAAAAAAAAAM/IaPWm8TDECc/s1600-h/404.JPG"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://2.bp.blogspot.com/_vjbmW2-heNI/Rz1waIyOGxI/AAAAAAAAAAM/IaPWm8TDECc/s200/404.JPG" alt="" id="BLOGGER_PHOTO_ID_5133382744609266450" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;What is going on? 
No matter how many times I check my script, rewrite it from A to Z, or try different URL files... wget cannot find the pages I ask it to download.&lt;br /&gt;But as I look at my &lt;a href="http://www.cygwin.com/"&gt;Cygwin&lt;/a&gt; window a little more closely, I notice that the URLs my script displays all end with %0D... %0D! Isn't that a control character? Like the one for a carriage return? I check the text file containing my URLs again: as I suspected, %0D appears nowhere in it.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_vjbmW2-heNI/Rz12p4yOGyI/AAAAAAAAAAU/dgkQJUuUFpI/s1600-h/0D.JPG"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://1.bp.blogspot.com/_vjbmW2-heNI/Rz12p4yOGyI/AAAAAAAAAAU/dgkQJUuUFpI/s200/0D.JPG" border="0" alt="" id="BLOGGER_PHOTO_ID_5133389612261972770" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Everything becomes clear! The line breaks in my text files (typed under Windows XP) are interpreted differently by Cygwin (which emulates a UNIX system). Windows ends each line with a carriage return followed by a line feed (CR+LF), whereas UNIX tools expect a line feed alone, so the carriage return stays attached to the end of each URL and wget sends it as %0D. 
Notepad++, among others, can fix this problem in a single click:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_vjbmW2-heNI/Rz15l4yOGzI/AAAAAAAAAAc/L_iijeKrlio/s1600-h/notepad.JPG"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://1.bp.blogspot.com/_vjbmW2-heNI/Rz15l4yOGzI/AAAAAAAAAAc/L_iijeKrlio/s200/notepad.JPG" border="0" alt="" id="BLOGGER_PHOTO_ID_5133392842077379378" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I can now rerun my script and note with satisfaction that the problem is solved:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_vjbmW2-heNI/Rz17noyOG0I/AAAAAAAAAAk/b_YVDNSGAnI/s1600-h/final.JPG"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://4.bp.blogspot.com/_vjbmW2-heNI/Rz17noyOG0I/AAAAAAAAAAk/b_YVDNSGAnI/s200/final.JPG" border="0" alt="" id="BLOGGER_PHOTO_ID_5133395071165406018" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/697386284077671949-6064386890101507112?l=shizen-ziran.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://shizen-ziran.blogspot.com/feeds/6064386890101507112/comments/default' title='Publier les commentaires'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=697386284077671949&amp;postID=6064386890101507112' title='0 commentaires'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/6064386890101507112'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/6064386890101507112'/><link rel='alternate' type='text/html' 
href='http://shizen-ziran.blogspot.com/2007/11/0d.html' title='%0D?'/><author><name>Pierre</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_vjbmW2-heNI/Rz1waIyOGxI/AAAAAAAAAAM/IaPWm8TDECc/s72-c/404.JPG' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-697386284077671949.post-6393236918289732522</id><published>2007-11-14T13:07:00.000-08:00</published><updated>2007-11-14T13:15:50.900-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='URL'/><title type='text'>自然 to express spontaneity</title><content type='html'>URL_25: &lt;a href="http://golf.nikkei.co.jp/news_ps/index.cfm?i=20070314ge000ge"&gt;http://golf.nikkei.co.jp/...&lt;/a&gt;&lt;br /&gt;URL_26: &lt;a href="http://sweetacorn.blog108.fc2.com/blog-entry-117.html"&gt;http://sweetacorn.blog108.fc2.com/...&lt;/a&gt;&lt;br /&gt;URL_27: &lt;a href="http://contents.innolife.net/news/list.php?ac_id=4&amp;ai_id=77450"&gt;http://contents.innolife.net/...&lt;/a&gt;&lt;br /&gt;URL_28: &lt;a href="http://www.city.takashima.shiga.jp/icity/browser?ActionCode=content&amp;ContentID=1187222126407&amp;SiteID=0"&gt;http://www.city.takashima.shiga.jp/...&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Text file containing the URLs: &lt;a href="http://pierre.inalco.free.fr/projet_encadre/05_spontaneite.txt"&gt;05_spontaneite.txt&lt;/a&gt;&lt;br /&gt;Pages downloaded with wget: &lt;a href="http://pierre.inalco.free.fr/projet_encadre/05_spontaneite/"&gt;&lt;05_spontaneite&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/697386284077671949-6393236918289732522?l=shizen-ziran.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://shizen-ziran.blogspot.com/feeds/6393236918289732522/comments/default' title='Publier les commentaires'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=697386284077671949&amp;postID=6393236918289732522' title='0 commentaires'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/6393236918289732522'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/6393236918289732522'/><link rel='alternate' type='text/html' href='http://shizen-ziran.blogspot.com/2007/11/pour-exprimer-la-spontanit.html' title='自然 pour exprimer la spontanéité'/><author><name>Pierre</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-697386284077671949.post-4679423217607970589</id><published>2007-11-14T12:26:00.000-08:00</published><updated>2007-11-14T12:54:13.050-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='URL'/><title type='text'>自然 to express a logical sequence</title><content type='html'>URL_21: &lt;a href="http://www.orsj.or.jp/~wiki/wiki/index.php/%E8%87%AA%E7%84%B6%E9%81%B8%E6%8A%9E"&gt;http://www.orsj.or.jp/~wiki/...&lt;/a&gt;&lt;br /&gt;URL_22: &lt;a href="http://blog.livedoor.jp/markzu/archives/50818594.html"&gt;http://blog.livedoor.jp/...&lt;/a&gt;&lt;br /&gt;URL_23: &lt;a href="http://bizplus.nikkei.co.jp/keiki/body.cfm?i=20070919kk000kk&amp;p=1"&gt;http://bizplus.nikkei.co.jp/...&lt;/a&gt;&lt;br /&gt;URL_24: &lt;a href="http://www.toonippo.co.jp/news_kyo/news/20071028010002661.asp"&gt;http://www.toonippo.co.jp/news_kyo/...&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Text file containing the URLs: &lt;a 
href="http://pierre.inalco.free.fr/projet_encadre/04_enchainement_logique.txt"&gt;04_enchainement_logique.txt&lt;/a&gt;&lt;br /&gt;Pages downloaded with wget: &lt;a href="http://pierre.inalco.free.fr/projet_encadre/04_enchainement_logique/"&gt;&lt;04_enchainement_logique&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/697386284077671949-4679423217607970589?l=shizen-ziran.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://shizen-ziran.blogspot.com/feeds/4679423217607970589/comments/default' title='Publier les commentaires'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=697386284077671949&amp;postID=4679423217607970589' title='0 commentaires'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/4679423217607970589'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/4679423217607970589'/><link rel='alternate' type='text/html' href='http://shizen-ziran.blogspot.com/2007/11/pour-exprimer-un-enchanement-logique.html' title='自然 pour exprimer un enchaînement logique'/><author><name>Pierre</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-697386284077671949.post-2308433005972551476</id><published>2007-11-11T05:18:00.000-08:00</published><updated>2007-11-11T05:29:25.150-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='URL'/><title type='text'>自然 to refer to spiritual forces</title><content type='html'>URL_18: &lt;a href="http://www.posteios.com/PROJ_D_HON053.htm"&gt;http://www.posteios.com/...&lt;/a&gt;&lt;br /&gt;URL_19: &lt;a 
href="http://blog.livedoor.jp/niida55/archives/50776127.html"&gt;http://blog.livedoor.jp/niida55/...&lt;/a&gt;&lt;br /&gt;URL_20: &lt;a href="http://www.junkudo.co.jp/detail2.jsp?ID=0103268612"&gt;http://www.junkudo.co.jp/...&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Text file containing the URLs: &lt;a href="http://pierre.inalco.free.fr/projet_encadre/03_forces_spirituelles.txt"&gt;03_forces_spirituelles.txt&lt;/a&gt;&lt;br /&gt;Pages downloaded with wget: &lt;a href="http://pierre.inalco.free.fr/projet_encadre/03_forces_spirituelles/"&gt;&lt;03_forces_spirituelles&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/697386284077671949-2308433005972551476?l=shizen-ziran.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://shizen-ziran.blogspot.com/feeds/2308433005972551476/comments/default' title='Publier les commentaires'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=697386284077671949&amp;postID=2308433005972551476' title='0 commentaires'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/2308433005972551476'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/2308433005972551476'/><link rel='alternate' type='text/html' href='http://shizen-ziran.blogspot.com/2007/11/pour-dsigner-des-forces-spirituelles.html' title='自然 pour désigner des forces spirituelles'/><author><name>Pierre</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-697386284077671949.post-8127417828275224944</id><published>2007-11-07T14:09:00.000-08:00</published><updated>2007-11-08T02:56:20.568-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='URL'/><title type='text'>自然 to refer to the raw, wild state of something</title><content type='html'>URL_11: &lt;a href="http://www.netsea.jp/shop/6638/naturalgemS"&gt;http://www.netsea.jp/shop/...&lt;/a&gt;&lt;br /&gt;URL_12: &lt;a href="http://www.ne.jp/asahi/014/584/kakou/sio.html"&gt;http://www.ne.jp/asahi/...&lt;/a&gt;&lt;br /&gt;URL_13: &lt;a href="http://sumai.nikkei.co.jp/special/health/news.cfm?i=20040824t2010t2"&gt;http://sumai.nikkei.co.jp/special/...&lt;/a&gt;&lt;br /&gt;URL_14: &lt;a href="http://www.geocities.jp/omorigold/A9_1.htm"&gt;http://www.geocities.jp/omorigold/...&lt;/a&gt;&lt;br /&gt;URL_15: &lt;a href="http://www.h5.dion.ne.jp/%7Enspicnic/mine/sample/copperKUSAMA.htm"&gt;http://www.h5.dion.ne.jp/...&lt;/a&gt;&lt;br /&gt;URL_16: &lt;a href="http://www.nikkei.co.jp/china/news/20070904d2m0401d04.html"&gt;http://www.nikkei.co.jp/china/...&lt;/a&gt;&lt;br /&gt;URL_17: &lt;a href="http://sumai.nikkei.co.jp/style/frontier/38.cfm"&gt;http://sumai.nikkei.co.jp/...&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Text file containing the URLs: &lt;a href="http://pierre.inalco.free.fr/projet_encadre/02_brut_sauvage.txt"&gt;02_brut_sauvage.txt&lt;/a&gt;&lt;br /&gt;Pages downloaded with wget: &lt;a href="http://pierre.inalco.free.fr/projet_encadre/02_brut_sauvage/"&gt;&lt;02_brut_sauvage&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/697386284077671949-8127417828275224944?l=shizen-ziran.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://shizen-ziran.blogspot.com/feeds/8127417828275224944/comments/default' title='Publier les commentaires'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=697386284077671949&amp;postID=8127417828275224944' title='0 commentaires'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/8127417828275224944'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/8127417828275224944'/><link rel='alternate' type='text/html' href='http://shizen-ziran.blogspot.com/2007/11/pour-dsigner-ltat-brut-sauvage-de.html' title='自然 pour désigner l&apos;état brut, sauvage de quelque chose'/><author><name>Pierre</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-697386284077671949.post-1256674599414914980</id><published>2007-11-01T09:03:00.000-07:00</published><updated>2007-11-01T11:30:42.506-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='URL'/><title type='text'>自然 pour désigner notre écosystème</title><content type='html'>URL_01: &lt;a href="http://mytown.asahi.com/shimane/news.php?k_id=33000180710220001"&gt;http://mytown.asahi.com/...&lt;/a&gt;&lt;br /&gt;URL_02: &lt;a href="http://waga.nikkei.co.jp/play/leisure.aspx"&gt;http://waga.nikkei.co.jp/...&lt;/a&gt;&lt;br /&gt;URL_03: &lt;a href="http://www.akita-kenmin.jp/maseken/shizenhogo.1.html"&gt;http://www.akita-kenmin.jp/...&lt;/a&gt;&lt;br /&gt;URL_04: &lt;a href="http://www.town.hayakawa.yamanashi.jp/2000/genre/keyword/mf_nature.html"&gt;http://www.town.hayakawa.yamanashi.jp/...&lt;/a&gt;&lt;br /&gt;URL_05: &lt;a href="http://www.peopledaily.co.jp/j/2000/12/29/jp20001229_1015.html"&gt;http://www.peopledaily.co.jp/...&lt;/a&gt;&lt;br /&gt;URL_06: 
&lt;a href="http://mytown.asahi.com/tokushima/news.php?k_id=37000000710140003"&gt;http://mytown.asahi.com/...&lt;/a&gt;&lt;br /&gt;URL_07: &lt;a href="http://www.pref.okinawa.jp/kodomo/bunka/d8_gyoji.html"&gt;http://www.pref.okinawa.jp/...&lt;/a&gt;&lt;br /&gt;URL_08: &lt;a href="http://www3.ic-net.or.jp/%7Eyaguchi/houwa/2000nen.htm"&gt;http://www3.ic-net.or.jp/...&lt;/a&gt;&lt;br /&gt;URL_09: &lt;a href="http://sumai.nikkei.co.jp/style/gardening/12_1.cfm"&gt;http://sumai.nikkei.co.jp/...&lt;/a&gt;&lt;br /&gt;URL_10: &lt;a href="http://mytown.asahi.com/okayama/news.php?k_id=34000000710290002"&gt;http://mytown.asahi.com/...&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Text file containing the URLs: &lt;a href="http://pierre.inalco.free.fr/projet_encadre/01_ecosysteme.txt"&gt;01_ecosysteme.txt&lt;/a&gt;&lt;br /&gt;Pages downloaded with wget: &lt;a href="http://pierre.inalco.free.fr/projet_encadre/01_ecosysteme/"&gt;&amp;lt;01_ecosysteme&amp;gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/697386284077671949-1256674599414914980?l=shizen-ziran.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://shizen-ziran.blogspot.com/feeds/1256674599414914980/comments/default' title='Publier les commentaires'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=697386284077671949&amp;postID=1256674599414914980' title='2 commentaires'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/1256674599414914980'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/1256674599414914980'/><link rel='alternate' type='text/html' href='http://shizen-ziran.blogspot.com/2007/11/dsignant-notre-cosystme.html' title='自然 pour désigner notre écosystème'/><author><name>Pierre</name><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-697386284077671949.post-1568193075041883724</id><published>2007-10-30T14:27:00.000-07:00</published><updated>2007-10-30T14:38:01.515-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Ressources'/><title type='text'>Définition Wikipédia Japon</title><content type='html'>&lt;p&gt;&lt;b&gt;自然&lt;/b&gt;（&lt;b&gt;しぜん&lt;/b&gt;）には次のような意味がある。&lt;/p&gt; &lt;ol&gt;&lt;li&gt;人間の作為が加わっていない、あるがままの状態、現象、およびそれによる生成物。&lt;/li&gt;&lt;li&gt;1の意味より、山、川、海など。&lt;/li&gt;&lt;li&gt;1の意味より、「&lt;b&gt;人間を除く&lt;/b&gt;」自然物および生物全般のこと。&lt;/li&gt;&lt;li&gt;1の意味より、&lt;b&gt;「ヒトも含めた&lt;/b&gt;&lt;span style="color: rgb(153, 153, 153);"&gt;[1]&lt;/span&gt;&lt;b&gt;」&lt;/b&gt;天地・宇宙の万物のこと&lt;/li&gt;&lt;li&gt;意識（意図）しない行動。本能による現象、行動のこと。 &lt;/li&gt;&lt;/ol&gt;&lt;span style="color: rgb(153, 153, 153);"&gt;[1]&lt;/span&gt; 「生物としてのヒト」は人工物ではない、という考え方より&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: right;"&gt;&lt;a href="http://ja.wikipedia.org/wiki/%E8%87%AA%E7%84%B6"&gt;[read more]&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/697386284077671949-1568193075041883724?l=shizen-ziran.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://shizen-ziran.blogspot.com/feeds/1568193075041883724/comments/default' title='Publier les commentaires'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=697386284077671949&amp;postID=1568193075041883724' title='0 commentaires'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/1568193075041883724'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/697386284077671949/posts/default/1568193075041883724'/><link rel='alternate' type='text/html' href='http://shizen-ziran.blogspot.com/2007/10/dfinition-wikipdia-japon.html' title='Définition Wikipédia Japon'/><author><name>Pierre</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-697386284077671949.post-3548302249972954863</id><published>2007-10-28T10:18:00.000-07:00</published><updated>2007-11-01T10:33:42.858-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Outils'/><title type='text'>De la concaténation de deux pages web</title><content type='html'>While digging through the wget documentation, we came across the &lt;span style="font-style: italic;"&gt;-O&lt;/span&gt; option, which writes &lt;span style="text-decoration: underline;"&gt;all&lt;/span&gt; the downloaded pages, one after the other, into a single HTML file.&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;wget -i [text file containing the URLs] -O [output HTML file]&lt;/blockquote&gt;&lt;br /&gt;This concatenation can give interesting results, as the example below shows.&lt;br /&gt;&lt;br /&gt;Input:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://headlines.yahoo.co.jp/hl?a=20071022-00000122-mailo-l20"&gt;URL_1&lt;/a&gt;: Japanese (EUC-JP)&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.ne.jp/asahi/014/584/kakou/sio.html"&gt;URL_2&lt;/a&gt;: Japanese (Shift_JIS)&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;Output:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://pierre.inalco.free.fr/projet_encadre/concatenation.html"&gt;URL_3&lt;/a&gt;: a single web page in Japanese that mixes two different encodings.&lt;/li&gt;&lt;/ul&gt;&lt;div 
class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/697386284077671949-3548302249972954863?l=shizen-ziran.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://shizen-ziran.blogspot.com/feeds/3548302249972954863/comments/default' title='Publier les commentaires'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=697386284077671949&amp;postID=3548302249972954863' title='0 commentaires'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/3548302249972954863'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/3548302249972954863'/><link rel='alternate' type='text/html' href='http://shizen-ziran.blogspot.com/2007/10/de-la-concatnation-de-deux-pages-web.html' title='De la concaténation de deux pages web'/><author><name>Pierre</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-697386284077671949.post-5522195061637359039</id><published>2007-10-25T08:03:00.000-07:00</published><updated>2007-10-25T13:21:05.202-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Outils'/><title type='text'>wget, va chercher!</title><content type='html'>Once we have created a text file containing the URLs (and only the URLs) of the pages we are interested in, we can move on to the next step: downloading those pages. 
To do this, we will use the &lt;a href="http://fr.wikipedia.org/wiki/Wget"&gt;wget&lt;/a&gt; download command.&lt;br /&gt;&lt;br /&gt;Two methods:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;To download the pages by hand, one after the other (if you really want to), the syntax is as follows: &lt;blockquote&gt;wget [URL]&lt;/blockquote&gt;&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;To download all the pages with a single command line: &lt;blockquote&gt;wget -i [text file containing the URLs]&lt;/blockquote&gt;&lt;/li&gt;&lt;/ul&gt;*Note that the downloaded pages are saved in the current directory.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/697386284077671949-5522195061637359039?l=shizen-ziran.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://shizen-ziran.blogspot.com/feeds/5522195061637359039/comments/default' title='Publier les commentaires'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=697386284077671949&amp;postID=5522195061637359039' title='0 commentaires'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/5522195061637359039'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/5522195061637359039'/><link rel='alternate' type='text/html' href='http://shizen-ziran.blogspot.com/2007/10/va-chercher-wget.html' title='wget, va chercher!'/><author><name>Pierre</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-697386284077671949.post-489034164590324655</id><published>2007-10-22T12:41:00.000-07:00</published><updated>2007-10-31T10:08:36.903-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Ressources'/><title type='text'>Définition Gendai Shinkokugojiten (Gakken)</title><content type='html'>&lt;strong&gt;し・ぜん&lt;/strong&gt;【自然】&lt;br /&gt;&lt;br /&gt;&lt;div style="text-align: left;"&gt;《名》①人間の手を加えない、そのもののありのままの状態。天然。「いつまでも&lt;strong&gt;―&lt;/strong&gt;の姿を保っている原野」「&lt;strong&gt;―&lt;/strong&gt;の驚異」「&lt;strong&gt;―&lt;/strong&gt;を愛する」　&lt;strong&gt;類&lt;/strong&gt;　原始。造化。万物。万象。万有。森羅万象。山川草木。花鳥風月。②人や物に本来そなわっている性質。&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div style="text-align: left;"&gt;《形動》①ありのままであるようす。むりがないようす。「&lt;strong&gt;&lt;span&gt;―&lt;/span&gt;&lt;/strong&gt;な動き」②ひとりでにそうなるようす。おのずと。「&lt;strong&gt;―&lt;/strong&gt;に戸が開く」「&lt;strong&gt;&lt;span&gt;―&lt;/span&gt;&lt;/strong&gt;に体がよくなる」&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;《副》ひとりでに。おのずと。「&lt;strong&gt;―&lt;/strong&gt;、笑みがこぼれる」&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;学研　現代新国語辞典&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/697386284077671949-489034164590324655?l=shizen-ziran.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://shizen-ziran.blogspot.com/feeds/489034164590324655/comments/default' title='Publier les commentaires'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=697386284077671949&amp;postID=489034164590324655' title='0 commentaires'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/489034164590324655'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/489034164590324655'/><link 
rel='alternate' type='text/html' href='http://shizen-ziran.blogspot.com/2007/10/dfinition-1.html' title='Définition Gendai Shinkokugojiten (Gakken)'/><author><name>Pierre</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-697386284077671949.post-3746441947148774095</id><published>2007-10-22T11:07:00.000-07:00</published><updated>2007-10-25T13:13:31.473-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Divers'/><title type='text'>Le blog est ouvert</title><content type='html'>...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/697386284077671949-3746441947148774095?l=shizen-ziran.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://shizen-ziran.blogspot.com/feeds/3746441947148774095/comments/default' title='Publier les commentaires'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=697386284077671949&amp;postID=3746441947148774095' title='0 commentaires'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/3746441947148774095'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/697386284077671949/posts/default/3746441947148774095'/><link rel='alternate' type='text/html' href='http://shizen-ziran.blogspot.com/2007/10/le-blog-est-ouvert.html' title='Le blog est ouvert'/><author><name>Pierre</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
