Is there a way to write it as a temporary file? Also, what about a stream - something like an RSS feed?
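Roughly what I have in mind for the temp-file case - an untested sketch, assuming string records and that the temporary file lives somewhere the executors can read (local mode, or a shared filesystem such as HDFS for a cluster):

import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public final class TempFileRDD {
    /**
     * Spill an iterable of lines to a temporary file and hand the path to
     * Spark, so the driver never holds the whole collection in memory.
     */
    public static JavaRDD<String> fromIterableViaTempFile(Iterable<String> lines,
                                                          JavaSparkContext ctx) throws IOException {
        Path tmp = Files.createTempFile("spark-input-", ".txt");
        tmp.toFile().deleteOnExit();
        try (BufferedWriter out = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
            for (String line : lines) {
                out.write(line);
                out.newLine();
            }
        }
        // textFile is lazy and partitioned like any other Hadoop input,
        // so nothing beyond one line at a time is buffered on the driver
        return ctx.textFile(tmp.toString());
    }
}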
On Mon, Sep 22, 2014 at 10:21 AM, Victor Tso-Guillen <v...@paxata.com> wrote:

> You can write to disk and have Spark read it as a stream. This is how
> Hadoop files are iterated in Spark.
>
> On Mon, Sep 22, 2014 at 9:22 AM, Steve Lewis <lordjoe2...@gmail.com> wrote:
>
>> The only way I have found is to turn it into a list - in effect holding
>> everything in memory (see the code below). Surely Spark has a better way.
>>
>> Also, what about unterminated iterables like a Fibonacci series (useful
>> only if limited in some other way)?
>>
>> /**
>>  * Make an RDD from an iterable.
>>  * @param inp input iterable
>>  * @param ctx Spark context
>>  * @param <T> element type
>>  * @return RDD built from the iterable, materialized as an in-memory list
>>  */
>> public static @Nonnull <T> JavaRDD<T> fromIterable(@Nonnull final Iterable<T> inp,
>>                                                    @Nonnull final JavaSparkContext ctx) {
>>     List<T> holder = new ArrayList<T>();
>>     for (T k : inp) {
>>         holder.add(k);
>>     }
>>     return ctx.parallelize(holder);
>> }

--
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com
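For the streaming case Victor mentions, a minimal Spark Streaming sketch using textFileStream; the /tmp/feed-inbox directory and the external fetcher that polls the RSS feed are assumptions for illustration, not anything from the thread:

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public final class FeedStream {
    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("FeedStream").setMaster("local[2]");
        // process whatever has arrived every 10 seconds
        JavaStreamingContext ssc = new JavaStreamingContext(conf, new Duration(10000));
        // A separate fetcher (hypothetical, not shown) polls the RSS feed and
        // drops each new entry as a file into /tmp/feed-inbox; Spark picks up
        // files that appear in the directory once per batch interval.
        JavaDStream<String> lines = ssc.textFileStream("/tmp/feed-inbox");
        lines.print();
        ssc.start();
        ssc.awaitTermination();
    }
}

This also suggests why an unbounded iterable (the Fibonacci example) fits streaming better than an RDD: an RDD is a finite, immutable dataset, while a DStream consumes new input batch by batch for as long as the job runs.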