<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/'><id>tag:blogger.com,1999:blog-34516211.post3871066359170940878..comments</id><updated>2008-11-15T09:17:14.826-08:00</updated><title type='text'>Comments on My SysAd Blog -- Unix: Split XML Records with Perl Script</title><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://www.mysysad.com/feeds/3871066359170940878/comments/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/34516211/3871066359170940878/comments/default'/><link rel='alternate' type='text/html' href='http://www.mysysad.com/2008/01/parse-xml-records-with-perl-script.html'/><author><name>esofthub</name><uri>http://www.blogger.com/profile/10822058426751039502</uri><email>esofthub@gmail.com</email></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>9</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-34516211.post-6790794767839342605</id><published>2008-10-05T19:19:00.000-07:00</published><updated>2008-10-05T19:19:00.000-07:00</updated><title type='text'>If you use lexical filehandles you not only circum...</title><content type='html'>If you use lexical filehandles you not only circumvent Blogger&amp;#39;s mangling, but also make your life easier.&lt;BR/&gt;&lt;BR/&gt;E.g.:&lt;BR/&gt;&lt;BR/&gt;open my $fh, &amp;#39;&amp;lt;&amp;#39;, $filename or die &amp;quot;Can&amp;#39;t open &amp;#39;$filename&amp;#39; for reading: $!&amp;quot;;&lt;BR/&gt;while ( my $line = &amp;lt;$fh&amp;gt; ) {&lt;BR/&gt;...&lt;BR/&gt;}&lt;BR/&gt;close $fh or die &amp;quot;Can&amp;#39;t close &amp;#39;$filename&amp;#39;: $!&amp;quot;;&lt;BR/&gt;&lt;BR/&gt;Notice that I&amp;#39;ve 
added some more good Perl programming advice (like correctly reporting errors, using the 3-parameter version of open, etc.).&lt;BR/&gt;&lt;BR/&gt;Also:&lt;BR/&gt;$file = @ARGV[ 0 ];&lt;BR/&gt;&lt;BR/&gt;is technically not correct; you want:&lt;BR/&gt;&lt;BR/&gt;$file = $ARGV[ 0 ];&lt;BR/&gt;&lt;BR/&gt;or&lt;BR/&gt;&lt;BR/&gt;$file = shift;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/34516211/3871066359170940878/comments/default/6790794767839342605'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/34516211/3871066359170940878/comments/default/6790794767839342605'/><link rel='alternate' type='text/html' href='http://www.mysysad.com/2008/01/parse-xml-records-with-perl-script.html?showComment=1223259540000#c6790794767839342605' title=''/><author><name>John Bokma</name><uri>http://johnbokma.com/</uri><email>noreply@blogger.com</email></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.mysysad.com/2008/01/parse-xml-records-with-perl-script.html' ref='tag:blogger.com,1999:blog-34516211.post-3871066359170940878' source='http://www.blogger.com/feeds/34516211/posts/default/3871066359170940878' type='text/html'/></entry><entry><id>tag:blogger.com,1999:blog-34516211.post-5250545638363050503</id><published>2008-08-16T06:29:29.527-07:00</published><updated>2008-08-16T06:29:29.527-07:00</updated><title type='text'>Starfrit,I just ran it. It worked fine. Maybe Blog...</title><content type='html'>Starfrit,&lt;BR/&gt;&lt;BR/&gt;I just ran it. It worked fine. Maybe Blogger parsed a tag. Did you do a copy and paste? Note that below I added an underscore, &amp;quot;_&amp;quot;, in the while statement, &amp;lt;_FH&amp;gt;, because Blogger was complaining about it during the submission of this comment. 
Remove it in your real script.&lt;BR/&gt;&lt;BR/&gt;# more splitter.pl&lt;BR/&gt;#!/usr/bin/perl&lt;BR/&gt;$file = @ARGV[0];&lt;BR/&gt;&lt;BR/&gt;open(FH, &amp;quot;&amp;lt; $file&amp;quot;) or die &amp;quot;Unable to open file\n&amp;quot;;&lt;BR/&gt;&lt;BR/&gt;$count = 0;&lt;BR/&gt;$files_counter=1;&lt;BR/&gt;$max_records = 300;&lt;BR/&gt;&lt;BR/&gt;while ( &amp;lt;_FH&amp;gt; )&lt;BR/&gt;{&lt;BR/&gt;if($count == 0)&lt;BR/&gt;{&lt;BR/&gt;$filename = $file . &amp;quot;_part_&amp;quot; . $files_counter;&lt;BR/&gt;open(FH2, &amp;quot;&amp;gt; $filename&amp;quot;) or die &amp;quot;Unable to open file: $filename\n&amp;quot;;&lt;BR/&gt;$count++;&lt;BR/&gt;}&lt;BR/&gt;&lt;BR/&gt;if (grep /&amp;lt;\/item&amp;gt;/, $_ )&lt;BR/&gt;{&lt;BR/&gt;$count++;&lt;BR/&gt;}&lt;BR/&gt;&lt;BR/&gt;print FH2 $_;&lt;BR/&gt;&lt;BR/&gt;if ($count == $max_records + 1)&lt;BR/&gt;{&lt;BR/&gt;$count = 0;&lt;BR/&gt;$files_counter++;&lt;BR/&gt;close(FH2);&lt;BR/&gt;}&lt;BR/&gt;}&lt;BR/&gt;&lt;BR/&gt;Now run it.&lt;BR/&gt;#./splitter.pl myxml.xml&lt;BR/&gt;&lt;BR/&gt;Here&amp;#39;s the output...&lt;BR/&gt;# ls -l myxml.xml*&lt;BR/&gt;-rwxrwxrwx 1 root other 25372146 Sep 19 2007 myxml.xml&lt;BR/&gt;-rw-r--r-- 1 root other 308835 Aug 16 22:05 myxml.xml_part_1&lt;BR/&gt;-rw-r--r-- 1 root other 282294 Aug 16 22:05 myxml.xml_part_10&lt;BR/&gt;-rw-r--r-- 1 root other 295288 Aug 16 22:05 myxml.xml_part_11&lt;BR/&gt;-rw-r--r-- 1 root other 298320 Aug 16 22:05 myxml.xml_part_12&lt;BR/&gt;-rw-r--r-- 1 root other 303570 Aug 16 22:05 myxml.xml_part_13&lt;BR/&gt;-rw-r--r-- 1 root other 297563 Aug 16 22:05 myxml.xml_part_14&lt;BR/&gt;-rw-r--r-- 1 root other 304841 Aug 16 22:05 myxml.xml_part_15&lt;BR/&gt;-rw-r--r-- 1 root other 298786 Aug 16 22:05 myxml.xml_part_16&lt;BR/&gt;-rw-r--r-- 1 root other 293452 Aug 16 22:05 myxml.xml_part_17&lt;BR/&gt;-rw-r--r-- 1 root other 293747 Aug 16 22:05 myxml.xml_part_18&lt;BR/&gt;-rw-r--r-- 1 root other 309752 Aug 16 22:05 myxml.xml_part_19&lt;BR/&gt;-rw-r--r-- 1 root other 
284603 Aug 16 22:05 myxml.xml_part_2&lt;BR/&gt;-rw-r--r-- 1 root other 304117 Aug 16 22:05 myxml.xml_part_20&lt;BR/&gt;-rw-r--r-- 1 root other 310830 Aug 16 22:05 myxml.xml_part_21&lt;BR/&gt;...</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/34516211/3871066359170940878/comments/default/5250545638363050503'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/34516211/3871066359170940878/comments/default/5250545638363050503'/><link rel='alternate' type='text/html' href='http://www.mysysad.com/2008/01/parse-xml-records-with-perl-script.html?showComment=1218893369527#c5250545638363050503' title=''/><author><name>esofthub</name><uri>http://www.blogger.com/profile/10822058426751039502</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='11633914086874244668'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.mysysad.com/2008/01/parse-xml-records-with-perl-script.html' ref='tag:blogger.com,1999:blog-34516211.post-3871066359170940878' source='http://www.blogger.com/feeds/34516211/posts/default/3871066359170940878' type='text/html'/></entry><entry><id>tag:blogger.com,1999:blog-34516211.post-1176870149066472110</id><published>2008-08-14T16:32:00.000-07:00</published><updated>2008-08-14T16:32:00.000-07:00</updated><title type='text'>I tried it because I'm experiencing the same prob...</title><content type='html'>I tried it because I'm experiencing the same problem, but the script doesn't work.&lt;BR/&gt;&lt;BR/&gt;I'm getting a syntax error at split.pl line 29, near : (:wq!)&lt;BR/&gt;Execution aborted because of compilation errors.&lt;BR/&gt;&lt;BR/&gt;What do I have to do to correct this?</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/34516211/3871066359170940878/comments/default/1176870149066472110'/><link rel='self' 
type='application/atom+xml' href='http://www.blogger.com/feeds/34516211/3871066359170940878/comments/default/1176870149066472110'/><link rel='alternate' type='text/html' href='http://www.mysysad.com/2008/01/parse-xml-records-with-perl-script.html?showComment=1218756720000#c1176870149066472110' title=''/><author><name>starfrit</name><email>noreply@blogger.com</email></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.mysysad.com/2008/01/parse-xml-records-with-perl-script.html' ref='tag:blogger.com,1999:blog-34516211.post-3871066359170940878' source='http://www.blogger.com/feeds/34516211/posts/default/3871066359170940878' type='text/html'/></entry><entry><id>tag:blogger.com,1999:blog-34516211.post-8850556116799633522</id><published>2008-01-18T09:19:56.957-08:00</published><updated>2008-01-18T09:19:56.957-08:00</updated><title type='text'>Erek,IMHO, they had other problems. I created anot...</title><content type='html'>Erek,&lt;BR/&gt;&lt;BR/&gt;IMHO, they had other problems. I created another tool and they wouldn't let me run it because special requirements were needed.&lt;BR/&gt;&lt;BR/&gt;At first, I tried to load the larger XML file but it was killing their CPU. Then I used this splitter to better manage it. For some reason, my web host was even sensitive to these smaller files after several uploads. I guess you get what you paid for.&lt;BR/&gt;&lt;BR/&gt;ux-admin is right -- I have bigger issues. Allowing someone else to run my site is crap but my choices are limited right now. I'm looking for an alternative solution.&lt;BR/&gt;&lt;BR/&gt;Interestingly enough, the servers I work with have several web servers and are very powerful. 
The web servers are just minor points.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/34516211/3871066359170940878/comments/default/8850556116799633522'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/34516211/3871066359170940878/comments/default/8850556116799633522'/><link rel='alternate' type='text/html' href='http://www.mysysad.com/2008/01/parse-xml-records-with-perl-script.html?showComment=1200676796957#c8850556116799633522' title=''/><author><name>esofthub</name><uri>http://www.blogger.com/profile/10822058426751039502</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='11633914086874244668'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.mysysad.com/2008/01/parse-xml-records-with-perl-script.html' ref='tag:blogger.com,1999:blog-34516211.post-3871066359170940878' source='http://www.blogger.com/feeds/34516211/posts/default/3871066359170940878' type='text/html'/></entry><entry><id>tag:blogger.com,1999:blog-34516211.post-4062781980865537889</id><published>2008-01-18T08:49:00.000-08:00</published><updated>2008-01-18T08:49:00.000-08:00</updated><title type='text'>I'm assuming a simple file upload isn't causing th...</title><content type='html'>I'm assuming a simple file upload isn't causing their CPU to go crazy, or they've got other problems, so I'm assuming that you're parsing the XML on the server side as it's uploaded.&lt;BR/&gt;&lt;BR/&gt;Have you considered nicing the receiving process down, or adding some sleep statements between iterations that'll make your % CPU usage seem lower?&lt;BR/&gt;&lt;BR/&gt;Cheers,&lt;BR/&gt;Erek</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/34516211/3871066359170940878/comments/default/4062781980865537889'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/34516211/3871066359170940878/comments/default/4062781980865537889'/><link rel='alternate' type='text/html' href='http://www.mysysad.com/2008/01/parse-xml-records-with-perl-script.html?showComment=1200674940000#c4062781980865537889' title=''/><author><name>Erek Dyskant</name><uri>http://erek.blumenthals.com/blog/</uri><email>noreply@blogger.com</email></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.mysysad.com/2008/01/parse-xml-records-with-perl-script.html' ref='tag:blogger.com,1999:blog-34516211.post-3871066359170940878' source='http://www.blogger.com/feeds/34516211/posts/default/3871066359170940878' type='text/html'/></entry><entry><id>tag:blogger.com,1999:blog-34516211.post-2205361732890739904</id><published>2008-01-13T04:25:06.813-08:00</published><updated>2008-01-13T04:25:06.813-08:00</updated><title type='text'>ux-admin,From previous posts, I know you're a big ...</title><content type='html'>ux-admin,&lt;BR/&gt;&lt;BR/&gt;From previous posts, I know you're a big AWK supporter.&lt;BR/&gt;&lt;BR/&gt;I thought about that scenario many times but living in South Korea as a US expat legally limits my enterprise options on their soil. If I lived in the US, it would be a completely different story.&lt;BR/&gt;&lt;BR/&gt;Believe me, I hate depending on someone else and almost to a fault. 
Obviously, I'm still looking for a solution to this problem...&lt;BR/&gt;&lt;BR/&gt;Ideas?</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/34516211/3871066359170940878/comments/default/2205361732890739904'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/34516211/3871066359170940878/comments/default/2205361732890739904'/><link rel='alternate' type='text/html' href='http://www.mysysad.com/2008/01/parse-xml-records-with-perl-script.html?showComment=1200227106813#c2205361732890739904' title=''/><author><name>esofthub</name><uri>http://www.blogger.com/profile/10822058426751039502</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='11633914086874244668'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.mysysad.com/2008/01/parse-xml-records-with-perl-script.html' ref='tag:blogger.com,1999:blog-34516211.post-3871066359170940878' source='http://www.blogger.com/feeds/34516211/posts/default/3871066359170940878' type='text/html'/></entry><entry><id>tag:blogger.com,1999:blog-34516211.post-6116557555037663587</id><published>2008-01-13T02:19:00.000-08:00</published><updated>2008-01-13T02:19:00.000-08:00</updated><title type='text'>Right. AWK is the perfect tool for the job in thi...</title><content type='html'>Right. AWK is the perfect tool for the job in this particular case.&lt;BR/&gt;&lt;BR/&gt;Anyways, if the web host is giving you crap, why not just roll out your own infrastructure?&lt;BR/&gt;&lt;BR/&gt;All you need is a basement with electricity, a used rack, and some sniping on eBay, or, if you're good with logistics, brand new servers at a fraction of the price of those cheap Dells (some assembly required).&lt;BR/&gt;&lt;BR/&gt;I'm kind of surprised. 
With your expertise, I didn't think you'd stand for depending on someone else for hosting.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/34516211/3871066359170940878/comments/default/6116557555037663587'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/34516211/3871066359170940878/comments/default/6116557555037663587'/><link rel='alternate' type='text/html' href='http://www.mysysad.com/2008/01/parse-xml-records-with-perl-script.html?showComment=1200219540000#c6116557555037663587' title=''/><author><name>UX-admin</name><email>noreply@blogger.com</email></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.mysysad.com/2008/01/parse-xml-records-with-perl-script.html' ref='tag:blogger.com,1999:blog-34516211.post-3871066359170940878' source='http://www.blogger.com/feeds/34516211/posts/default/3871066359170940878' type='text/html'/></entry><entry><id>tag:blogger.com,1999:blog-34516211.post-3287914516623379675</id><published>2008-01-12T09:41:44.892-08:00</published><updated>2008-01-12T09:41:44.892-08:00</updated><title type='text'>A very good point jean-marc liotier.I wrote an AWK...</title><content type='html'>A very good point jean-marc liotier.&lt;BR/&gt;&lt;BR/&gt;I wrote an AWK-based parser script to get it into the XML format, which I shared a few months ago. &lt;BR/&gt;&lt;BR/&gt;Anyways, I'll make the changes to better reflect its functionality. 
For our small project, it was a steady stream.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/34516211/3871066359170940878/comments/default/3287914516623379675'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/34516211/3871066359170940878/comments/default/3287914516623379675'/><link rel='alternate' type='text/html' href='http://www.mysysad.com/2008/01/parse-xml-records-with-perl-script.html?showComment=1200159704892#c3287914516623379675' title=''/><author><name>esofthub</name><uri>http://www.blogger.com/profile/10822058426751039502</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='11633914086874244668'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.mysysad.com/2008/01/parse-xml-records-with-perl-script.html' ref='tag:blogger.com,1999:blog-34516211.post-3871066359170940878' source='http://www.blogger.com/feeds/34516211/posts/default/3871066359170940878' type='text/html'/></entry><entry><id>tag:blogger.com,1999:blog-34516211.post-4361157659352232095</id><published>2008-01-12T08:55:00.000-08:00</published><updated>2008-01-12T08:55:00.000-08:00</updated><title type='text'>Handy, but it is actually more an XML splitter than...</title><content type='html'>Handy, but it is actually more an XML splitter than an XML parser. 
In addition it assumes the flat structure of a steady stream of items.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/34516211/3871066359170940878/comments/default/4361157659352232095'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/34516211/3871066359170940878/comments/default/4361157659352232095'/><link rel='alternate' type='text/html' href='http://www.mysysad.com/2008/01/parse-xml-records-with-perl-script.html?showComment=1200156900000#c4361157659352232095' title=''/><author><name>Jean-Marc Liotier</name><uri>http://www.blogger.com/profile/10064902723155303224</uri><email>noreply@blogger.com</email></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://www.mysysad.com/2008/01/parse-xml-records-with-perl-script.html' ref='tag:blogger.com,1999:blog-34516211.post-3871066359170940878' source='http://www.blogger.com/feeds/34516211/posts/default/3871066359170940878' type='text/html'/></entry></feed>