Re: [I] Datafusion binary size has been getting bigger [datafusion]

2025-03-17 Thread via GitHub
logan-keede commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2728637022 > Hey [@logan-keede](https://github.com/logan-keede) please ping me in ASF slack, I'm not using discord now @comphead I pinged you on slack.

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2025-03-16 Thread via GitHub
comphead commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2727619833 Hey @logan-keede please ping me in ASF slack, I'm not using discord now

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2025-03-16 Thread via GitHub
logan-keede commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2727534337 Hi @comphead, I would really appreciate it if you could give me some feedback on my GSoC proposal. Let me know if that is feasible or if there is anything else that I can do

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2025-03-11 Thread via GitHub
comphead commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2711029575 Hey @logan-keede I would think this ticket is a good fit for GSoC https://github.com/apache/datafusion/issues/14510

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2025-03-11 Thread via GitHub
comphead commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2711670193 Yeah, compile time would be closer to the original `Software Engineering, Refactoring, Dependency Management, Compilers` title

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2025-03-11 Thread via GitHub
logan-keede commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2711096610 > Hey [@logan-keede](https://github.com/logan-keede) I would think this ticket is a good fit for GSoC [#14510](https://github.com/apache/datafusion/issues/14510) Than

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2025-03-11 Thread via GitHub
alamb commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2711634351 > > Hey [@logan-keede](https://github.com/logan-keede) I would think this ticket is a good fit for GSoC [#14510](https://github.com/apache/datafusion/issues/14510) > > Than

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2025-03-11 Thread via GitHub
alamb commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2713739585 > Thanks again, I believe this proposal might suit me even better considering my efforts over the last month or two. > You can look forward to a proposal draft soon! : )

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2025-03-10 Thread via GitHub
logan-keede commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2711982900 > > > Hey [@logan-keede](https://github.com/logan-keede) I would think this ticket is a good fit for GSoC [#14510](https://github.com/apache/datafusion/issues/14510) > >

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2025-03-09 Thread via GitHub
logan-keede commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2708902951 Optimizing binary size https://github.com/apache/datafusion/issues/13816 > [Optimizing DataFusion Binary Size](https://github.com/apache/datafusion/issues/13816) C

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2025-02-26 Thread via GitHub
alamb closed issue #13816: Datafusion binary size has been getting bigger URL: https://github.com/apache/datafusion/issues/13816 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2025-02-24 Thread via GitHub
alamb commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2678811695 > @alamb WDYT should we dig deeper? I don't think so. It is fascinating how much binary size we can save without unwinding.

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2025-02-23 Thread via GitHub
comphead commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2677180516 I checked the biggest methods are std panic methods, removing unwind can save even more ``` panic = "abort" ``` ``` du -s -h target/release/datafusion-c

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2025-02-23 Thread via GitHub
comphead commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2677151841 So in the data above (ARM macOS) the biggest parts are:
- code (compiled instructions): 41MB
- consts: 2-3MB

@alamb WDYT, should we dig deeper?

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2025-02-23 Thread via GitHub
comphead commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2677147665 Stripped binary by inner segments ``` size -m -l target/release/datafusion-cli Segment __PAGEZERO: 4294967296 (zero fill) (vmaddr 0x0 fileoff 0) Segment __TEXT:

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2025-02-23 Thread via GitHub
comphead commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2677140990 Tbh I don't have an accurate answer for this. I found it when I played with different feature sets on the latest DF, but I remember some packages were moved out of the core or similar.

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2025-02-23 Thread via GitHub
alamb commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2676877830 > btw after changes in 45.0.0 the image size is 49M 🎉 Nice! Do you know what changed? Indeed I checked on my mac after doing `cargo build --release` and the size is

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2025-02-20 Thread via GitHub
comphead commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2673189679 btw after changes in 45.0.0 the image size is 49M 🎉

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2025-02-14 Thread via GitHub
alamb commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2659471604 At a high level, I think this ticket has 2 parts:
1. Figure out what is contributing to code size increase
2. Then perhaps figure out how to make it better

I think the

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2024-12-24 Thread via GitHub
alamb commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2561469424 FWIW I don't think the size of the datafusion-cli binary is all that critical per se (maybe we can adjust / optimize the size of what is distributed on homebrew) What I was

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2024-12-24 Thread via GitHub
comphead commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2561318792 Thanks @Omega359. Opt-level is 3 by default for the release profile (https://doc.rust-lang.org/cargo/reference/profiles.html#release), which focuses on maximum runtime speed, I think it is

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2024-12-23 Thread via GitHub
Omega359 commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2560327632
```
[profile.release]
codegen-units = 1
strip = true
panic = "abort"
opt-level = "s"
```
Expanding on @comphead's idea adding opt-level = "s" reduced the
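
The settings above change the default release profile for everyone who builds the crate. A minimal sketch of one alternative, assuming a separate custom Cargo profile is acceptable: the profile name `release-small` is hypothetical and is not something the thread or the project defines. It inherits from `release` and overrides only the size-related keys, so the default `opt-level = 3` release build stays untouched.

```toml
# Hypothetical size-focused profile; the name `release-small` is an assumption.
[profile.release-small]
inherits = "release"   # start from the default release settings (opt-level = 3)
opt-level = "s"        # optimize for binary size instead of runtime speed
codegen-units = 1      # one codegen unit: slower builds, more whole-crate optimization
strip = true           # strip symbols from the final binary
panic = "abort"        # abort on panic instead of unwinding
```

Under these assumptions the profile would be selected with `cargo build --profile release-small`, and Cargo would place the output under `target/release-small/`. Note that `panic = "abort"` changes behavior as well as size: a panic terminates the process and can no longer be caught.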

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2024-12-22 Thread via GitHub
alamb commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2558507731 That is a very cool page 🤔

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2024-12-20 Thread via GitHub
comphead commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2557916785 Some good experiments are https://github.com/johnthagen/min-sized-rust?tab=readme-ov-file#optimize-libstd-with-xargo with this profile ``` [profile.release]

Re: [I] Datafusion binary size has been getting bigger [datafusion]

2024-12-20 Thread via GitHub
comphead commented on issue #13816: URL: https://github.com/apache/datafusion/issues/13816#issuecomment-2557881624
```
print_functions_docs
print_functions_config
```
binaries can be moved out from the main release

[I] Datafusion binary size has been getting bigger [datafusion]

2024-12-17 Thread via GitHub
alamb opened a new issue, #13816: URL: https://github.com/apache/datafusion/issues/13816 ### Is your feature request related to a problem or challenge? The size of datafusion's binary has grown significantly in the last few releases. This likely leads to higher compile times as