ひまつぶし: 2010/08

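Tuesday, August 31, 2010

I want to install Redmine! Part 1

...and I have done this so many times that I can no longer tell which packages I installed because they were actually needed. Reference link: "Debian GNU/Linux LennyへのRails 2.0導入メモ" (notes on installing Rails 2.0 on Debian GNU/Linux Lenny) - 発声練習
So. For some reason rubygems is already installed as well, which means my situation already differs from the one in the link above...
Digging a little further, I also found this: 全自動ねじまき機: "Debian lenny + apache2 でredmineを動かすその1" (running Redmine on Debian lenny + apache2, part 1). On lenny, is it fine to install Rails with aptitude too??
I purged everything that looked like a ruby package, then started work.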

Setting up Backports

The available rails versions are as follows:

lenny	lenny-backports	squeeze
2.1.0-7	2.3.5-1~bpo50+1	2.3.5-1

Given that, let's try installing it from backports.

Configuring apt/sources.list

Add the following:

deb http://www.jp.backports.org lenny-backports main

Configuring apt/preferences

Add the following:

Package: *
Pin: release a=lenny-backports
Pin-Priority: 200

Then, the first time I run $ sudo aptitude update, it complains:

W: GPG error: http://www.jp.backports.org lenny-backports Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY EA8E8B2116BA136C
W: You may want to run apt-get update to correct these problems

To fix that, install debian-backports-keyring.

$ sudo aptitude install debian-backports-keyring
　：

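WARNING: untrusted versions of the following packages will be installed!

Untrusted packages could compromise your system's security.
You should only proceed with the installation if you are certain that
this is what you want to do.

debian-backports-keyring

Do you want to ignore this warning and proceed anyway?
To continue, enter "Yes"; to abort, enter "No":

...what a frightening thing to say. Well, it has to be installed, so Yes.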
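
With the keyring in place, the pin priority of 200 set above should mean that, for a package that exists in both lenny and lenny-backports, apt keeps preferring the lenny version unless the backports release is requested explicitly. A rough sketch of that next step follows; this post has not actually run it yet. The helper names install_from_backports and show_candidates are made up for illustration, while aptitude -t and apt-cache policy are the standard tools.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): install the named
# packages from lenny-backports instead of the default lenny versions.
install_from_backports() {
    # -t selects the target release for this one invocation only
    sudo aptitude -t lenny-backports install "$@"
}

# Hypothetical helper: show the candidate version apt would pick and the
# pin priority of every available version.
show_candidates() {
    apt-cache policy "$@"
}

# Example calls, commented out so that sourcing this file changes nothing:
# show_candidates rails
# install_from_backports rails
```

Checking show_candidates rails first is a cheap way to confirm that the preferences entry is actually being picked up before anything gets installed.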
That's all for today.

Monday, August 23, 2010

Access: exporting to a text file

Why on earth does it depend on the "Regional and Language Options" control panel, of all things? Blow up already!

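Monday, August 9, 2010

DokuWiki-2009-12-25c Part 3

Update

The message in question should change once conf/msg is updated... but it doesn't. Why?

Even after you upgrade your DokuWiki and the number in the conf/msg file has increased, the update notice may keep being displayed. This is caused by DokuWiki's cache. DokuWiki caches the update notices it fetches, and it only fetches them again when one day has passed since the last fetch or when the modification time of conf/msg is newer than that of the cache. To stop an old update notice from being shown, simply wait a day, change the modification time of conf/msg with the touch command or similar, or delete the data/cache/messages.txt file.

So it was properly documented right here all along. After deleting data/cache/messages.txt, the notice went away as expected.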
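
To make the rule quoted above concrete, here is a small shell sketch of the same freshness check. It is only my reading of the documentation, not DokuWiki's actual code, and the function name needs_refetch is made up. It assumes a find that supports -mmin and a shell whose test supports -nt.

```shell
#!/bin/sh
# needs_refetch CACHE_FILE MSG_FILE
# Exit status 0 means "the update notice would be fetched again":
# there is no cache yet, the cache is older than one day,
# or conf/msg was modified after the cache was written.
needs_refetch() {
    cache=$1
    msg=$2
    if [ ! -e "$cache" ]; then
        return 0
    fi
    # find prints the path only when it was modified more than 1440 minutes ago
    if [ -n "$(find "$cache" -mmin +1440 2>/dev/null)" ]; then
        return 0
    fi
    if [ "$msg" -nt "$cache" ]; then
        return 0
    fi
    return 1
}

# Demo in a scratch directory: a fresh cache next to an old conf/msg.
demo=$(mktemp -d)
: > "$demo/messages.txt"
touch -t 200001010000 "$demo/msg"
if needs_refetch "$demo/messages.txt" "$demo/msg"; then
    echo "refetch"
else
    echo "keep cached notice"
fi
rm -rf "$demo"
```

This also shows why deleting the cache file worked right away: with no cache file, the check never gets as far as comparing timestamps.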

DokuWiki-2009-12-25c Part 2

indexer.php (2009-12-25)

The 12-25 version. The parts I had modified for mecab do not appear to be affected.


--- indexer.2009-02-14.php      2010-08-09 14:36:23.000000000 +0900
+++ indexer.2009-12-25.php      2010-08-09 14:36:40.000000000 +0900
@@ -74,7 +74,7 @@
         fwrite($fh,$line);
     }
     fclose($fh);
-    if($conf['fperm']) chmod($fn.'.tmp', $conf['fperm']);
+    if(isset($conf['fperm'])) chmod($fn.'.tmp', $conf['fperm']);
     io_rename($fn.'.tmp', $fn.'.idx');
     return true;
 }
@@ -574,12 +574,16 @@

     // merge found pages into final result array
     $final = array();
-    foreach(array_keys($result) as $word){
+    foreach($result as $word => $res){
         $final[$word] = array();
-        foreach($result[$word] as $wid){
+        foreach($res as $wid){
             $hits = &$docs[$wid];
             foreach ($hits as $hitkey => $hitcnt) {
-                $final[$word][$hitkey] = $hitcnt + $final[$word][$hitkey];
+                if (!isset($final[$word][$hitkey])) {
+                    $final[$word][$hitkey] = $hitcnt;
+                } else {
+                    $final[$word][$hitkey] += $hitcnt;
+                }
             }
         }
     }
@@ -664,7 +668,9 @@
     if (empty($page_idx)) return;
     $pagewords = array();
     $len = count($page_idx);
-    for ($n=0;$n<$len;$n++) $pagewords[] = array();
+    for ($n=0;$n<$len;$n++){
+        $pagewords[] = array();
+    }
     unset($page_idx);

     $n=0;

indexer.php (2009-12-25, with mecab support)


--- indexer.2009-12-25.php      2010-08-09 15:03:07.000000000 +0900
+++ indexer.2009-12-25.mod.php     2010-08-09 15:13:33.000000000 +0900
@@ -45,6 +45,8 @@
                    ']?');
 define('IDX_ASIAN', '(?:'.IDX_ASIAN1.'|'.IDX_ASIAN2.'|'.IDX_ASIAN3.')');

+define('PRE_TOKENIZER', '/usr/bin/mecab -O wakati');
+
 /**
  * Measure the length of a string.
  * Differs from strlen in handling of asian characters.
@@ -52,11 +54,16 @@
  * @author Tom N Harris <tnharris@whoopdedo.org>
  */
 function wordlen($w){
-    $l = strlen($w);
+    //$l = strlen($w);
+    $l = utf8_strlen($w);
+
+    /*
     // If left alone, all chinese "words" will get put into w3.idx
     // So the "length" of a "word" is faked
     if(preg_match('/'.IDX_ASIAN2.'/u',$w))
         $l += ord($w) - 0xE1;  // Lead bytes from 0xE2-0xEF
+     */
+
     return $l;
 }

@@ -220,6 +227,28 @@

     list($page,$body) = $data;

+    if(function_exists('proc_open') && defined('PRE_TOKENIZER')) {
+        $dspec = array(
+            0 => array("pipe", "r"),
+            1 => array("pipe", "w"),
+            2 => array("file", "/dev/null", "w")
+        );
+        $process = proc_open(PRE_TOKENIZER, $dspec, $pipes);
+        if(is_resource($process)) {
+            stream_set_blocking($pipes[0], FALSE);
+            stream_set_blocking($pipes[1], FALSE);
+            fwrite($pipes[0], $body . "\n");
+            fclose($pipes[0]);
+
+            $body = '';
+            while(!feof($pipes[1])) {
+                $body .= fgets($pipes[1], 32768);
+            }
+            fclose($pipes[1]);
+            proc_close($process);
+        }
+    }
+
     $body   = strtr($body, "\r\n\t", '   ');
     $tokens = explode(' ', $body);
     $tokens = array_count_values($tokens);   // count the frequency of each token
@@ -489,7 +518,8 @@
             $wild |= 2;
             $wlen -= 1;
         }
-        if ($wlen < IDX_MINWORDLENGTH && $wild == 0 && !is_numeric($xword)) continue;
+        //if ($wlen < IDX_MINWORDLENGTH && $wild == 0 && !is_numeric($xword)) continue;
+        if (preg_match('/[^0-9A-Za-z]/u', $string) && $wlen < IDX_MINWORDLENGTH && $wild == 0 && !is_numeric($xword)) continue;
         if(!isset($tokens[$xword])){
             $tokenlength[$wlen][] = $xword;
         }
@@ -632,12 +662,36 @@
  */
 function idx_tokenizer($string,&$stopwords,$wc=false){
     $words = array();
+
+    if(function_exists('proc_open') && defined('PRE_TOKENIZER')) {
+        $dspec = array(
+            0 => array("pipe", "r"),
+            1 => array("pipe", "w"),
+            2 => array("file", "/dev/null", "w")
+        );
+        $process = proc_open(PRE_TOKENIZER, $dspec, $pipes);
+        if(is_resource($process)) {
+            stream_set_blocking($pipes[0], FALSE);
+            stream_set_blocking($pipes[1], FALSE);
+            fwrite($pipes[0], $string . "\n");
+            fclose($pipes[0]);
+            $string = '';
+            while(!feof($pipes[1])) {
+                $string .= fgets($pipes[1], 32768);
+            }
+            fclose($pipes[1]);
+            proc_close($process);
+        }
+    }
+
     $wc = ($wc) ? '' : $wc = '\*';

     if(preg_match('/[^0-9A-Za-z]/u', $string)){
+        /*
         // handle asian chars as single words (may fail on older PHP version)
         $asia = @preg_replace('/('.IDX_ASIAN.')/u',' \1 ',$string);
         if(!is_null($asia)) $string = $asia; //recover from regexp failure
+        */

         $arr = explode(' ', utf8_stripspecials($string,' ','\._\-:'.$wc));
         foreach ($arr as $w) {

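To be continued

DokuWiki-2009-12-25c Part 1

How it started

This message is starting to get annoying, so I decided it was about time to upgrade DokuWiki.

I am currently running dokuwiki-2009-02-14b. Apparently there was also a dokuwiki-2009-12-02 in between. I hope this goes smoothly.

indexer.php (2009-02-14, with mecab support)

These are the parts I modified back at 2009-02-14 (02-14b only updated a few files).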


--- indexer.php.org    2009-02-14 21:13:24.000000000 +0900
+++ indexer.php.mod    2009-04-23 16:17:13.000000000 +0900
@@ -45,6 +45,8 @@
                    ']?');
 define('IDX_ASIAN', '(?:'.IDX_ASIAN1.'|'.IDX_ASIAN2.'|'.IDX_ASIAN3.')');

+define('PRE_TOKENIZER', '/usr/bin/mecab -O wakati');
+
 /**
  * Measure the length of a string.
  * Differs from strlen in handling of asian characters.
@@ -52,11 +54,16 @@
  * @author Tom N Harris <tnharris@whoopdedo.org>
  */
 function wordlen($w){
-    $l = strlen($w);
+    //$l = strlen($w);
+    $l = utf8_strlen($w);
+
+    /*
     // If left alone, all chinese "words" will get put into w3.idx
     // So the "length" of a "word" is faked
     if(preg_match('/'.IDX_ASIAN2.'/u',$w))
         $l += ord($w) - 0xE1;  // Lead bytes from 0xE2-0xEF
+     */
+
     return $l;
 }

@@ -220,6 +227,28 @@

     list($page,$body) = $data;

+    if(function_exists('proc_open') && defined('PRE_TOKENIZER')) {
+        $dspec = array(
+            0 => array("pipe", "r"),
+            1 => array("pipe", "w"),
+            2 => array("file", "/dev/null", "w")
+        );
+        $process = proc_open(PRE_TOKENIZER, $dspec, $pipes);
+        if(is_resource($process)) {
+            stream_set_blocking($pipes[0], FALSE);
+            stream_set_blocking($pipes[1], FALSE);
+            fwrite($pipes[0], $body . "\n");
+            fclose($pipes[0]);
+
+            $body = '';
+            while(!feof($pipes[1])) {
+                $body .= fgets($pipes[1], 32768);
+            }
+            fclose($pipes[1]);
+            proc_close($process);
+        }
+    }
+
     $body   = strtr($body, "\r\n\t", '   ');
     $tokens = explode(' ', $body);
     $tokens = array_count_values($tokens);   // count the frequency of each token
@@ -489,7 +518,8 @@
             $wild |= 2;
             $wlen -= 1;
         }
-        if ($wlen < IDX_MINWORDLENGTH && $wild == 0 && !is_numeric($xword)) continue;
+        //if ($wlen < IDX_MINWORDLENGTH && $wild == 0 && !is_numeric($xword)) continue;
+        if (preg_match('/[^0-9A-Za-z]/u', $string) && $wlen < IDX_MINWORDLENGTH && $wild == 0 && !is_numeric($xword)) continue;
         if(!isset($tokens[$xword])){
             $tokenlength[$wlen][] = $xword;
         }
@@ -628,12 +658,36 @@
  */
 function idx_tokenizer($string,&$stopwords,$wc=false){
     $words = array();
+
+    if(function_exists('proc_open') && defined('PRE_TOKENIZER')) {
+        $dspec = array(
+            0 => array("pipe", "r"),
+            1 => array("pipe", "w"),
+            2 => array("file", "/dev/null", "w")
+        );
+        $process = proc_open(PRE_TOKENIZER, $dspec, $pipes);
+        if(is_resource($process)) {
+            stream_set_blocking($pipes[0], FALSE);
+            stream_set_blocking($pipes[1], FALSE);
+            fwrite($pipes[0], $string . "\n");
+            fclose($pipes[0]);
+            $string = '';
+            while(!feof($pipes[1])) {
+                $string .= fgets($pipes[1], 32768);
+            }
+            fclose($pipes[1]);
+            proc_close($process);
+        }
+    }
+
     $wc = ($wc) ? '' : $wc = '\*';

     if(preg_match('/[^0-9A-Za-z]/u', $string)){
+        /*
         // handle asian chars as single words (may fail on older PHP version)
         $asia = @preg_replace('/('.IDX_ASIAN.')/u',' \1 ',$string);
         if(!is_null($asia)) $string = $asia; //recover from regexp failure
+        */

         $arr = explode(' ', utf8_stripspecials($string,' ','\._\-:'.$wc));
         foreach ($arr as $w) {

On Debian, installing mecab with apt puts the binary at /usr/bin/mecab, but people on other environments will need to change the path in PRE_TOKENIZER.
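
For anyone on such an environment, a quick way to find the right value is to ask the shell where the tokenizer lives. A minimal sketch, assuming mecab is on PATH; the function name tokenizer_define is made up here, and the sample sentence is just an arbitrary test input.

```shell
#!/bin/sh
# Print a PRE_TOKENIZER define line for the given command (default: mecab),
# using whatever location the shell resolves it to.
# Returns non-zero when the command is not on PATH.
tokenizer_define() {
    cmd=${1:-mecab}
    path=$(command -v "$cmd") || return 1
    printf "define('PRE_TOKENIZER', '%s -O wakati');\n" "$path"
}

if line=$(tokenizer_define mecab); then
    echo "$line"
    # -O wakati asks mecab to print the input split into space-separated words
    echo 'すもももももももものうち' | mecab -O wakati || echo "mecab failed to run" >&2
else
    echo "mecab not found on PATH" >&2
fi
```

The printed define line can then replace the one in the patch above. If the wakati output looks garbled, the dictionary encoding probably does not match the UTF-8 that DokuWiki hands over.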

To be continued
