Skåne Sjælland Linux User Group - http://www.sslug.dk
 

Perl regex + memory management



Hi!

I have a problem where I read roughly $N=100 regexes from a
file and use them to substitute text in $P=1000 files,
specifically to insert links around words that match the regexes.

I load each file into a Pathtml object to have a bit of
control over things. $regex[$i] is a string (not a compiled regex).
The code below works fine, except that it is silly because
I read and write the same file $N times.

A snippet of the relevant part of the code:

:
:

for (my $i=0; $i<$N; $i++)      #loop over regexes
{
   for (my $j=0; $j<$P; $j++)   #loop over files
       {
          $p=Pathtml->new($fil[$j]);

          $u1 = "<a href=\"$root/$path/$i\">";
          $u2 = "</a>";

          $a=$p->getTitle();
          $count_title = $a=~/$regex[$i]/i;

          if ($count_title!=0)
             {
                $a =~ s|($regex[$i])|$u1$1$u2|gi;
                $p->setTitle($a);
             }
:
the above repeated 3 more times for other members
:
          $p->save();  #save changed state
       }
}

:
:

It becomes many times faster if I swap the two loops, so that I
only read and write each file once. The problem is just that I
then get about 25% of the way through the files before the
program is killed because it runs out of memory
(perl is at ~300 MB by that point).
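
Schematically, the swapped version looks like this (same names as
above, the other members elided):

for (my $j=0; $j<$P; $j++)      #outer loop over files
{
   $p=Pathtml->new($fil[$j]);   #each file is read once...

   for (my $i=0; $i<$N; $i++)   #inner loop over regexes
   {
      $u1 = "<a href=\"$root/$path/$i\">";
      $u2 = "</a>";

      $a=$p->getTitle();
      if ($a =~ /$regex[$i]/i)
      {
         $a =~ s|($regex[$i])|$u1$1$u2|gi;
         $p->setTitle($a);
      }
   }
   $p->save();                  #...and written once
}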

The memory problem disappears when I add /o to the substitutions,
but then they don't work, i.e. it is related to the way Perl
handles strings in regex expressions. At first I thought my
object was the problem, but that is not the case.
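
If I understand /o correctly, it tells Perl to compile the
interpolated pattern only once, so every later value of $regex[$i]
is silently ignored. A tiny standalone test shows the effect:

#!/usr/bin/perl
use strict;
use warnings;

my @pats = ('foo', 'bar');
for my $pat (@pats) {
    my $text = "foo bar";
    $text =~ s/$pat/X/o;       # /o: the first compiled pattern is reused
    print "$pat => $text\n";
}
# prints:
#   foo => X bar
#   bar => X bar   ('bar' never matched; /o kept the 'foo' pattern)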

I have also played around with replacing $regex[$i] with
$r=qr/$regex[$i]/, but either I run out of memory, or the
substitution doesn't work the way it should.
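
One variant I have in mind looks roughly like this (a sketch only,
compiling every pattern once before either loop and reusing the
compiled objects inside):

my @r = map { qr/$_/i } @regex;   #one compilation per pattern

# ... inside the inner loop:
$a=$p->getTitle();
if ($a =~ $r[$i])
{
   $a =~ s|($r[$i])|$u1$1$u2|g;   #/i is already part of the qr//
   $p->setTitle($a);
}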

I am running Perl 5.8.1.

-- 
  Best regards, Carsten Svaneborg
http://www.softwarepatenter.dk


 