There is nothing in the system.log when the aggregation query fails.

Thanks for the Datastax clarification.

Thanks,
Dinesh.

On 12/24/2015 2:46 PM, DuyHai Doan wrote:
The exception stack trace at client side shows some issue with File Permission. Try to look for the same error message in system.log to chase down the root issue.

"Would trying the Datastax distribution offer any better chances?" --> No, DSC is just a packaging of C* OSS

On Thu, Dec 24, 2015 at 7:07 AM, Dinesh Shanbhag <dinesh.shanb...@isanasystems.com> wrote:


    Even if an aggregation that forces a full table scan across
    partitions is not recommended, the message/exception does seem
    unrelated to partitioning:

       cqlsh:flightdata> select late_flights(uniquecarrier, depdel15) from
       flightsbydate in ('2015-09-15', '2015-09-16',
       '2015-09-17', '2015-09-18', '2015-09-19', '2015-09-20',
       '2015-09-21');

       Traceback (most recent call last):
         File "CassandraInstall-3.1/bin/cqlsh.py", line 1258, in perform_simple_statement
           result = future.result()
         File "/home/wpl/CassandraInstall-3.1/bin/../lib/cassandra-driver-internal-only-3.0.0-6af642d.zip/cassandra-driver-3.0.0-6af642d/cassandra/cluster.py", line 3122, in result
           raise self._final_exception
       FunctionFailure: code=1400 [User Defined Function failure]
       message="execution of 'flightdata.state_late_flights[map<text,
       frozen<tuple<int, int>>>, text, decimal]' failed:
       java.security.AccessControlException: access denied
       ("java.io.FilePermission"
       "/home/wpl/CassandraInstall-3.1/conf/logback.xml" "read")"

    Is that right?

    And note that this same aggregation query (on a subset of the
    month's days) does complete successfully sometimes.

    The behavior is similar with Cassandra 3.0 as well: on the same
    set of days, the query sometimes succeeds but fails most of the
    time. Would trying the Datastax distribution offer any better
    chances?

    Thanks,
    Dinesh.


    On 12/24/2015 2:59 AM, DuyHai Doan wrote:

        Thanks for the pointer on internal paging Tyler, I missed this
        one. But then it raises some questions:

        1. Is it possible to "tune" the page size, or is it hard-coded
        internally?
        2. Is read-repair performed on EACH page, or is it done on the
        whole set of requested rows once they are fetched?

        Question 2 is relevant in scenarios where the user is using
        CL QUORUM (or higher) and some replicas are out of sync. Even
        when aggregating over a single partition, if that partition
        is wide and spans many fetch pages, the time the coordinator
        spends on read-repair and reconciliation across the QUORUM
        replicas may cause the query to time out very quickly.


        On Fri, Dec 18, 2015 at 5:26 PM, Tyler Hobbs
        <ty...@datastax.com> wrote:


            On Fri, Dec 18, 2015 at 9:17 AM, DuyHai Doan
            <doanduy...@gmail.com> wrote:

                Cassandra will perform a full table scan and fetch all the
                data in memory to apply the aggregate function.


            Just to clarify for others on the list: when executing
            aggregation functions, Cassandra /will/ use paging
            internally, so at most one page worth of data will be held
            in memory at a time. However, if your aggregation function
            retains a large amount of data, this may contribute to
            heap pressure.
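
The point above can be illustrated with a rough sketch. This is plain Python, not Cassandra's actual implementation: rows are consumed one page at a time, while the aggregate state lives for the whole query. The state shape (a map of carrier to a pair of ints) follows the state_late_flights signature shown in the error message earlier in the thread; reading the pair as (total flights, late flights), the helper names pages and aggregate, and the page size are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Tuple[str, int]               # (uniquecarrier, depdel15)
State = Dict[str, Tuple[int, int]]  # carrier -> (total, late); meaning assumed


def pages(rows: Iterable[Row], page_size: int) -> Iterator[List[Row]]:
    """Yield rows in chunks so only one page is materialized at a time."""
    page: List[Row] = []
    for row in rows:
        page.append(row)
        if len(page) == page_size:
            yield page
            page = []
    if page:
        yield page


def state_late_flights(state: State, carrier: str, depdel15: int) -> State:
    """Hypothetical state function: count flights and late flights per carrier."""
    total, late = state.get(carrier, (0, 0))
    state[carrier] = (total + 1, late + (1 if depdel15 else 0))
    return state


def aggregate(rows: Iterable[Row], page_size: int = 2) -> State:
    """Fold the state function over every row, page by page."""
    state: State = {}
    for page in pages(rows, page_size):
        # The page can be discarded after this loop; the state cannot.
        for carrier, depdel15 in page:
            state = state_late_flights(state, carrier, depdel15)
    return state


if __name__ == "__main__":
    sample = [("AA", 1), ("AA", 0), ("DL", 1)]
    print(aggregate(sample, page_size=2))
```

In this sketch the memory held for rows is bounded by page_size, but the state map grows with the number of distinct carriers, which is the kind of retained data that could add to heap pressure.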


            --     Tyler Hobbs
            DataStax <http://datastax.com/>




