Wang Zhong wrote:
> You can try using FSDataOutputStream in the reduce phase. Create a file
> with FSDataOutputStream as shown below:
>
> ====
> FileSystem fs = FileSystem.get(conf);
> FSDataOutputStream os = fs.create(path);
> os.writeChars(str);
> ====
>
> You should call writeChars in each iteration over your values rather
> than using a StringBuffer. The key should be part of your file name to
> indicate the group of URIs.
>
> On Wed, Apr 29, 2009 at 2:56 PM, nguyenhuynh.mr
> <[email protected]> wrote:
>
>> Wang Zhong wrote:
>>
>>> Where did you get the large string? Can't you generate the string one
>>> line at a time and append it to local files, then upload to HDFS when
>>> finished?
>>>
>>> On Wed, Apr 29, 2009 at 10:47 AM, nguyenhuynh.mr
>>> <[email protected]> wrote:
>>>
>>>> Hi all!
>>>>
>>>> I have a large String and I want to write it to a file in HDFS.
>>>> (The large string has >100,000 lines.)
>>>>
>>>> Currently, I use the copyBytes method of class
>>>> org.apache.hadoop.io.IOUtils. But copyBytes requires an InputStream of
>>>> the content, so I have to convert the String to an InputStream,
>>>> something like:
>>>>
>>>> InputStream in=new ByteArrayInputStream(sb.toString().getBytes());
>>>>
>>>> Here "sb" is a StringBuffer.
>>>>
>>>> It does not work with the line above. :(
>>>>
>>>> Here is the error:
>>>>
>>>> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>>>>     at java.lang.StringCoding$StringEncoder.encode(StringCoding.java:232)
>>>>     at java.lang.StringCoding.encode(StringCoding.java:272)
>>>>     at java.lang.String.getBytes(String.java:947)
>>>>     at asnet.haris.mapred.jobs.Test.main(Test.java:32)
>>>>
>>>> Please suggest a good solution!
>>>>
>>>> Thanks,
>>>>
>>>> Best regards,
>>>> Nguyen,
>>>
>> Thanks for your answer!
>>
>> I have a Map/Reduce job. It partitions URIs from HBase into groups of URIs.
>> In the map phase, I get the group name of each URI and collect the output
>> <groupname, uri>.
>> In the reduce phase, I get the String (the URIs of the partition) and save
>> it to HDFS.
>> Each group is a file.
>>
>> Thanks,
>>
>> Best regards,
>> NguyenHuynh.

Thanks very much!
Best, Nguyen.
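
For anyone finding this thread later, the advice above can be sketched roughly as follows. This is only an illustration in plain java.io, not the poster's actual job: StreamingWriter and writeLines are hypothetical names, and the ByteArrayOutputStream stands in for the stream that FileSystem.create(path) would return on HDFS. The idea is to write each value as it arrives instead of collecting everything in a StringBuffer and calling getBytes() on the result, which is where the OutOfMemoryError in the stack trace was raised.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Iterator;

class StreamingWriter {

    // Writes each value as one UTF-8 line and returns the number of lines written.
    // Only the current value and the writer's buffer are held in memory at a time.
    static long writeLines(Iterator<String> values, OutputStream out) throws IOException {
        long count = 0;
        // try-with-resources flushes and closes the writer and the wrapped stream.
        try (Writer w = new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8))) {
            while (values.hasNext()) {
                w.write(values.next());
                w.write('\n');
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: on HDFS this would be the stream returned by FileSystem.create(path).
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long n = writeLines(
                Arrays.asList("http://a.example/1", "http://a.example/2").iterator(), out);
        System.out.println(n + " lines, " + out.size() + " bytes");
    }
}
```

In a reducer, the iterator would be the values for one group key, with the key used in the file name as suggested above. The sketch writes UTF-8 text through a Writer; writeChars would instead emit two bytes per char, so which one fits depends on how the files are read back.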
