I've written a program to process a text file. The input file is generated on a 
DOS computer and transferred to my Linux host. Most of the time, the operator 
remembers to transfer it in BINARY mode rather than ASCII, so that the DOS line 
endings are preserved. Occasionally, however, they forget, the line endings are 
converted to Unix style, and the program doesn't produce any output.

I've pasted in two records from the input file below. It is an extraction from 
Medline, and every record ends in two CR/LF combinations.

My program has sections in it like:

$/="\015\012\015\012";  # Read one whole record (records are separated by a blank line) at a time.
$\="\015\012";  # Output line termination is CRLF (for DOS)

and

while (<>) {
   chomp;

   # These single entry fields can only be one single line.
   my($dp) = /DP  - (.*?)\015\012/;
   my($ip) = /IP  - (.*?)\015\012/;
   my($pg) = /PG  - (.*?)\015\012/;

As you can see, I hard-coded the CR/LF as '\015\012'.

Can anyone suggest a way to modify my program so that it doesn't matter if the 
line endings are DOS or Unix? My first attempt was to just change the lines to 
this:
   my($dp) = /DP  - (.*?)\015?\012/;
   my($ip) = /IP  - (.*?)\015?\012/;
   my($pg) = /PG  - (.*?)\015?\012/;
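
As a minimal sketch of what that change does: the sample strings and the field() helper below are made up for illustration and are not part of the real program. The idea is only that an optional CR before the LF lets the same pattern match a record with either kind of line ending.

```perl
use strict;
use warnings;

# Hypothetical sample data: the same three fields with DOS and Unix endings.
my $dos  = "IP  - 9641\015\012DP  - 2008 Sep 6\015\012PG  - 801-3\015\012";
my $unix = "IP  - 9641\012DP  - 2008 Sep 6\012PG  - 801-3\012";

# field() is an illustrative helper: return the value of a single-line field.
# The CR (\015) before the LF (\012) is optional, so both styles match, and
# the non-greedy capture stops before the CR when one is present.
sub field {
    my ($tag, $record) = @_;
    my ($value) = $record =~ /^\Q$tag\E\s*- (.*?)\015?\012/m;
    return $value;
}

for my $record ($dos, $unix) {
    printf "DP=%s IP=%s PG=%s\n", map { field($_, $record) } qw(DP IP PG);
}
```

Both passes through the loop should print the same values. Note that this only helps once a record has been read; it does nothing about how records are split.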

But I didn't think that this would work, since $/ is a plain string and not a 
regular expression:

$/="\015?\012\015?\012";  # Read one whole record (records are separated by a blank line) at a time.
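
One possible way around this, shown only as a sketch under assumptions: read the whole input as a single string, convert any CRLF to LF, and then split on blank lines instead of relying on $/. The split_records() name and the sample strings are hypothetical, and reading everything into memory assumes the input file is reasonably small.

```perl
use strict;
use warnings;

# split_records() is a hypothetical helper, not part of the original program.
# It normalizes line endings and returns one string per record.
sub split_records {
    my ($text) = @_;
    $text =~ s/\015\012/\012/g;                     # DOS CRLF -> Unix LF
    return grep { /\S/ } split /\012{2,}/, $text;   # blank lines separate records
}

# Made-up sample input; the real text might come from slurping the file,
# e.g. my $text = do { local $/; <STDIN> };
my $dos  = "PMID- 1\015\012PG  - 801-3\015\012\015\012PMID- 2\015\012PG  - 827\015\012\015\012";
my $unix = "PMID- 1\012PG  - 801-3\012\012PMID- 2\012PG  - 827\012\012";

for my $text ($dos, $unix) {
    for my $record (split_records($text)) {
        # After normalization, '$' with /m marks the end of a line, so the
        # pattern no longer needs to mention CR or LF at all.
        my ($pg) = $record =~ /^PG  - (.*)$/m;
        print "PG=$pg\n";
    }
}
```

With the line endings normalized up front, the field patterns no longer care which kind of file arrived, and $\ can still be set to CRLF for the output.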

Can anyone suggest a better solution?

Thanks in advance for your advice and help.

-Kevin

Kevin Zembower
Internet Services Group manager
Center for Communication Programs
Bloomberg School of Public Health
Johns Hopkins University
111 Market Place, Suite 310
Baltimore, Maryland  21202
410-659-6139
=============================================
PMID- 18774411
OWN - NLM
STAT- MEDLINE
DA  - 20080908
DCOM- 20080919
PUBM- Print
IS  - 1474-547X (Electronic)
VI  - 372
IP  - 9641
DP  - 2008 Sep 6
TI  - Therapeutic hypothermia for birth asphyxia in low-resource settings: a pilot
      randomised controlled trial.
PG  - 801-3
FAU - Robertson, Nicola J
AU  - Robertson NJ
FAU - Nakakeeto, Margaret
AU  - Nakakeeto M
FAU - Hagmann, Cornelia
AU  - Hagmann C
FAU - Cowan, Frances M
AU  - Cowan FM
FAU - Acolet, Dominique
AU  - Acolet D
FAU - Iwata, Osuke
AU  - Iwata O
FAU - Allen, Elizabeth
AU  - Allen E
FAU - Elbourne, Diana
AU  - Elbourne D
FAU - Costello, Anthony
AU  - Costello A
FAU - Jacobs, Ian
AU  - Jacobs I
LA  - eng
PT  - Letter
PT  - Randomized Controlled Trial
PL  - England
TA  - Lancet
JT  - Lancet
JID - 2985213R
SB  - AIM
SB  - IM
MH  - Apgar Score
MH  - Asphyxia Neonatorum/complications/*therapy
MH  - Birth Weight
MH  - Humans
MH  - Hypothermia, Induced/*methods
MH  - Hypoxia, Brain/classification/etiology/prevention & control
MH  - Infant, Newborn
MH  - Length of Stay
MH  - Pilot Projects
MH  - Severity of Illness Index
MH  - Treatment Outcome
MH  - Uganda
EDAT- 2008/09/09 09:00
MHDA- 2008/09/20 09:00
AID - S0140-6736(08)61329-X [pii]
AID - 10.1016/S0140-6736(08)61329-X [doi]
PST - ppublish
SO  - Lancet. 2008 Sep 6;372(9641):801-3.

PMID- 18770893
OWN - NLM
STAT- MEDLINE
DA  - 20080903
DCOM- 20080916
PUBM- Print
IS  - 1474-5488 (Electronic)
VI  - 9
IP  - 9
DP  - 2008 Sep
TI  - H. pylori and gastric cancer in Asia: enigma, or a play on words?
PG  - 827
FAU - Sharma, Sharan Prakash
AU  - Sharma SP
LA  - eng
PT  - News
PL  - England
TA  - Lancet Oncol
JT  - The lancet oncology
JID - 100957246
SB  - IM
MH  - Antibiotic Prophylaxis
MH  - Asia/epidemiology
MH  - Helicobacter Infections/*epidemiology/prevention & control
MH  - *Helicobacter pylori
MH  - Humans
MH  - Risk Factors
MH  - Stomach Neoplasms/epidemiology/*microbiology/prevention & control
EDAT- 2008/09/05 09:00
MHDA- 2008/09/17 09:00
PST - ppublish
SO  - Lancet Oncol. 2008 Sep;9(9):827.
