On 2017-09-26 11:35, Joris Meys wrote:
> I don't like the dropping of dimensions either. That doesn't change the fact that a tibble behaves differently from a data.frame. So tibbles do not inherit correctly from the class data.frame, and it can thus be argued that it's against OOP paradigms to pretend that they do. Defensive coding techniques would check whether the input is a tibble and return an error saying a data.frame is expected, unless tibbles inherit correctly from data.frame.

The correct and logical way (which I use in 'eha') is to check if input is a data frame, and if not, throw an error. Checking for other things would soon be too overwhelming.
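
A minimal sketch of such a check, not the actual code in 'eha': the helper name `check_data_frame()` and its `strict` argument are hypothetical. Because a tibble passes `is.data.frame()`, the sketch also looks for the class "tbl_df" that tibbles carry, and either rejects the input or converts it with `as.data.frame()`.

```r
# Hypothetical helper, for illustration only (not taken from 'eha').
check_data_frame <- function(x, strict = FALSE) {
  if (!is.data.frame(x)) {
    stop("a data.frame is expected, got an object of class ",
         paste(class(x), collapse = "/"), call. = FALSE)
  }
  # Tibbles carry the class "tbl_df" in addition to "data.frame".
  if (inherits(x, "tbl_df")) {
    if (strict) {
      stop("a plain data.frame is expected, not a tibble", call. = FALSE)
    }
    x <- as.data.frame(x)  # fall back to base data frame semantics
  }
  x
}

df <- check_data_frame(data.frame(a = 1:5, b = letters[5:1]))
df[3, "a"]  # plain data frame: the single cell comes back as a vector
```

With `strict = FALSE` the function body downstream can keep using base data frame subsetting; with `strict = TRUE` the caller gets an explicit error instead of a confusing failure later on.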


> I have nothing against tibbles. But calling them "data.frame" raises expectations that can't be fulfilled.

Exactly what I think. I wouldn't object to changing base data frames to behave like tibbles (with a few exceptions).

Göran


On Tue, Sep 26, 2017 at 11:23 AM, Stefan McKinnon Høj-Edwards <s...@iysik.com> wrote:

    Thanks for the examples. Personally, I have been caught out multiple
    times by data frames dropping dimensions, so I have a distaste for
    this dropping behaviour.

    I prefer data frames *not* to drop dimensions. They are not arrays,
    where dropping a dimension when slicing makes sense because all
    entries have the same data type.
    You can pull out a column in vector form from both tibbles and data
    frames with the $ operator; subsetting a row from a data frame and
    forcing it into an atomic vector requires casting all columns to the
    lowest common denominator, often character.

    So I would argue that yes, tibbles are data frames with extra bells
    and whistles, even if I do not understand the use of list columns.

    I suggest a defensive coding technique: if you need a data frame
    subset to really be a vector, cast it as a vector. Users *will*
    attempt to throw unexpected structures at your methods. When your
    methods fail in mysterious ways because they didn't extract a vector,
    users will be stupefied. A failure at `as.vector` will indicate why.
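
A short sketch of this advice, using base R only; the helper name `column_as_vector()` is hypothetical. `[[` returns the column itself for data frames and tibbles alike, and the explicit `is.atomic()` check makes an unexpected structure (for instance a list column) fail early with a clear message.

```r
# Hypothetical helper: pull one column out as an atomic vector, or fail loudly.
column_as_vector <- function(data, column) {
  stopifnot(is.data.frame(data), length(column) == 1L)
  x <- data[[column]]  # `[[` returns the column itself
  if (is.null(x)) {
    stop("no column named '", column, "'", call. = FALSE)
  }
  if (!is.atomic(x)) {
    stop("column '", column, "' is not an atomic vector", call. = FALSE)
  }
  x
}

adf <- data.frame(a = 1:5, b = letters[5:1])
column_as_vector(adf, "a")

# A whole row mixes types, so flattening it coerces everything to character.
unlist(lapply(adf[3, ], as.character))
```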

    Kindly,
    Stefan

    Stefan McKinnon Høj-Edwards
    ph.d. Genetics
    +44 (0)776 231 2464
    +45 2888 6598
    Skype: stefan_edwards

    2017-09-26 10:05 GMT+01:00 Joris Meys <joris.m...@ugent.be>:

        Here's one difference:

        atib <- tibble(a = 1:5, b = letters[5:1])
        atib[3,"a"]
        as.data.frame(atib)[3,"a"]

        The second line returns a tibble (dimensions are not dropped),
        while the third line returns a plain vector (dimensions are
        dropped). That is a huge difference if you use [ , aColumn] to
        select a vector from a data frame.
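
The same contrast can be reproduced in base R alone through the `drop` argument of `[`, which also lets code state the intended result instead of relying on the default. A sketch (the object name `adf` is chosen only for illustration):

```r
adf <- data.frame(a = 1:5, b = letters[5:1])

adf[3, "a"]                # default for a data frame: drops to a vector
adf[3, "a", drop = FALSE]  # keeps a 1 x 1 data frame, the tibble-like result
adf[, "a", drop = TRUE]    # asks explicitly for the column as a vector
```

Writing `drop` out explicitly documents which of the two behaviours a function depends on.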

        Cheers
        Joris

        On Tue, Sep 26, 2017 at 10:57 AM, Stefan McKinnon Høj-Edwards
        <s...@iysik.com> wrote:

            Hi Göran,

            Could you please elaborate on which kind of subsetting
            Hadley dislikes?
            I have yet to encounter operations on data frames that are
            not possible on tibbles.

            Kindly,
            Stefan McKinnon Hoj-Edwards


            2017-09-26 8:30 GMT+01:00 Göran Broström
            <goran.brost...@umu.se>:

             > I am beginning to get complaints from users of my CRAN packages
             > (especially 'eha') to the effect that they get error messages like
             > "Error: Unsupported use of matrix or array for column indexing".
             >
             > It turns out that they are passing tibbles to functions that expect
             > data frames as input. And I am using the kind of subsetting that
             > Hadley dislikes (eha is an old package, much older than tibbles). It
             > is of course a simple matter to change the code so it handles both
             > data frames and tibbles correctly, but this affects many functions,
             > and it will take some time. And when the next guy introduces
             > 'troubles' as an improvement of 'tibbles', I will have to rewrite the
             > code again.
             >
             > While I like Hadley's way of doing it, I think it is a mistake to let
             > a tibble also be of class data frame. To me it is a matter of
             > inheritance and backwards compatibility: a tibble should add nice
             > things to a data frame, not change basic behaviour, in order to call
             > itself a data frame.
             >
             > Is it correct to let a tibble be of class "data.frame"?
             >
             > Göran Broström
             >
             > ______________________________________________
             > R-package-devel@r-project.org mailing list
             > https://stat.ethz.ch/mailman/listinfo/r-package-devel

        --
        Joris Meys
        Statistical consultant

        Ghent University
        Faculty of Bioscience Engineering
        Department of Mathematical Modelling, Statistics and Bio-Informatics

        tel : +32 9 264 59 87
        joris.m...@ugent.be
        -------------------------------
        Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php




