Hi
records count  => more than 160,0000
No of columns => 87 Columns (80 columns data null only)
How big is one row on average, in bytes? Approximately 300MB size

New stats after removing 80 columns with same no of records
BenchmarkDotNet v0.14.0, Windows 10
(10.0.19044.2130/21H2/November2021Update)
Intel Core i5-8500 CPU 3.00GHz (Coffee Lake), 1 CPU, 6 logical and 6
physical cores
.NET SDK 8.0.101
  [Host]     : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2
  DefaultJob : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2


| Method       | Mean     | Error    | StdDev   | Gen0      | Gen1     |
Gen2     | Allocated |
|-------------
|---------:|---------:|---------:|----------:|---------:|---------:|----------:|
| Linq         | 188.6 ms | 15.08 ms | 43.26 ms | 1000.0000 |        - |
     - | 126.01 MB |
| LinqCompiled | 166.8 ms |  3.24 ms |  5.23 ms |  666.6667 | 666.6667 |
333.3333 | 121.18 MB |

Thanks
Charlin

On Fri, 22 Nov 2024 at 00:07, Pavel Tupitsyn <ptupit...@apache.org> wrote:

> So how is the real data different from the test data?
> - How many rows are there?
> - How many columns?
> - How big is one row on average, in bytes?
>
> On Thu, Nov 21, 2024 at 7:08 PM Charlin S <charli...@hotelhub.com> wrote:
>
>> Hi,
>> The summary with real data :-
>> BenchmarkDotNet v0.14.0, Windows 10
>> (10.0.19044.2130/21H2/November2021Update)
>> Intel Core i5-8500 CPU 3.00GHz (Coffee Lake), 1 CPU, 6 logical and 6
>> physical cores
>> .NET SDK 8.0.101
>>   [Host]     : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2
>>   DefaultJob : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2
>>
>>
>> | Method       | Mean     | Error    | StdDev   | Median   | Gen0      |
>> Gen1      | Gen2      | Allocated |
>> |-------------
>> |---------:|---------:|---------:|---------:|----------:|----------:|----------:|----------:|
>> | Linq         | 530.0 ms | 17.28 ms | 48.46 ms | 512.4 ms | 7000.0000 |
>> 3000.0000 | 1000.0000 | 999.57 MB |
>> | LinqCompiled | 543.3 ms | 21.48 ms | 59.53 ms | 525.3 ms | 6000.0000 |
>> 2000.0000 | 1000.0000 | 994.65 MB |
>>
>> // * Hints *
>> Outliers
>>   Benchmarks.Linq: Default         -> 9 outliers were removed (722.82
>> ms..1.42 s)
>>   Benchmarks.LinqCompiled: Default -> 11 outliers were removed (767.85
>> ms..2.67 s)
>>
>> Thanks
>> Charlin
>>
>> On Thu, 21 Nov 2024 at 16:01, Pavel Tupitsyn <ptupit...@apache.org>
>> wrote:
>>
>>> Can you adjust this benchmark project to have the data and query more
>>> like your real project?
>>> Currently we have a similar number of rows and the query performs very
>>> well, let's understand what is different.
>>>
>>> On Thu, Nov 21, 2024 at 11:43 AM Charlin S <charli...@hotelhub.com>
>>> wrote:
>>>
>>>> Hi,
>>>> Thank you for your email. I apologize for the late reply.
>>>> I ran the sample and summaries are:-
>>>>
>>>> BenchmarkDotNet v0.14.0, Windows 10
>>>> (10.0.19044.2130/21H2/November2021Update)
>>>> Intel Core i5-8500 CPU 3.00GHz (Coffee Lake), 1 CPU, 6 logical and 6
>>>> physical cores
>>>> .NET SDK 8.0.101
>>>>   [Host]     : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2
>>>>   DefaultJob : .NET 8.0.1 (8.0.123.58001), X64 RyuJIT AVX2
>>>>
>>>>
>>>> | Method           | Mean        | Error    | StdDev   |
>>>> |--------------------|--------------:|----------:|---------------:|
>>>> | Linq                |   94.60 us | 1.515 us | 1.417 us |
>>>> | LinqCompiled |  16.50 us  | 0.328 us | 0.449 us |
>>>>
>>>> LINQCompiled much faster than LINQ in sample code but with real data
>>>> it's not much different.
>>>>
>>>>
>>>> Thanks
>>>> Charlin
>>>>
>>>> On Thu, 14 Nov 2024 at 20:30, Pavel Tupitsyn <ptupit...@apache.org>
>>>> wrote:
>>>>
>>>>> 1. Run the code in Release configuration
>>>>> 2. Try Compiled Query
>>>>> https://ignite.apache.org/docs/latest/net-specific/net-linq#compiled-queries
>>>>>
>>>>> My benchmark on 550_000 rows with 1 matching row shows 28 microseconds
>>>>> for regular query and 6 microseconds for compiled query:
>>>>> https://gist.github.com/ptupitsyn/189c8164061bace8d975b2ec39045ca4
>>>>>
>>>>> 200 ms is a lot. Can you run my code and share the results?
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Nov 14, 2024 at 1:46 PM Charlin S <charli...@hotelhub.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>> I was measuring performance after 5 times. Now I have run more than
>>>>>> 100 times and  var queryRes = query.ToList(); taken 202 milliseconds
>>>>>> as minimum.
>>>>>> Will be appreciated if I get a solution for getting the result in 10
>>>>>> - 50 milliseconds.
>>>>>>
>>>>>> Regards,
>>>>>> Charlin
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, 14 Nov 2024 at 15:35, Pavel Tupitsyn <ptupit...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> How do you measure? Do you perform a warm up before measurement?
>>>>>>> Try running the query 100 times before the measurement. Or, ideally,
>>>>>>> use BenchmarkDotNet to get accurate results.
>>>>>>>
>>>>>>> On Thu, Nov 14, 2024 at 11:51 AM Charlin S <charli...@hotelhub.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> Thank you for your response.and sample.
>>>>>>>> My cache instance was created as TestIcache=_ignite.
>>>>>>>> GetOrCreateCache<string, TestModel>("TestModel"); so index was not
>>>>>>>> applied.
>>>>>>>> now I am creating cache instance like below and index applied
>>>>>>>>   var cacheCfgTestModel = new CacheConfiguration("TestModel", new
>>>>>>>> QueryEntity(typeof(TestModel)));
>>>>>>>>          TestIcache=_ignite.GetOrCreateCache<string, TestModel>(
>>>>>>>> cacheCfgTestModel);
>>>>>>>>
>>>>>>>> SQL:
>>>>>>>> select _T0._KEY, _T0._VAL from "TestModel".TESTMODEL as _T0 where
>>>>>>>> (_T0.COUNTRYCODE IS NOT DISTINCT FROM ?)
>>>>>>>>
>>>>>>>> EXPLANATION:
>>>>>>>> SELECT
>>>>>>>>     _T0__Z0._KEY AS __C0_0,
>>>>>>>>     _T0__Z0._VAL AS __C0_1
>>>>>>>> FROM "TestModel".TESTMODEL _T0__Z0
>>>>>>>>     /* "TestModel".TESTMODEL_COUNTRYCODE_ASC_IDX: COUNTRYCODE IS ?1
>>>>>>>> */
>>>>>>>> WHERE _T0__Z0.COUNTRYCODE IS ?1
>>>>>>>> SELECT
>>>>>>>>     __C0_0 AS _KEY,
>>>>>>>>     __C0_1 AS _VAL
>>>>>>>> FROM PUBLIC.__T0
>>>>>>>>     /* "TestModel"."merge_scan" */
>>>>>>>>
>>>>>>>> Total records:27713 = > 27713
>>>>>>>> Time taken for Query:2 milliseconds ->  var query = TestIcache.
>>>>>>>> AsCacheQueryable().Where(x => x.Value.CountryCode == CountryCode);
>>>>>>>> Time taken for QueryResult:313 milliseconds -> var queryRes = query
>>>>>>>> .ToList();
>>>>>>>> Total Time taken:316 milliseconds
>>>>>>>> Why var queryRes = query.ToList(); taking 313 milliseconds even if
>>>>>>>> the result having only one record is taking nearly 300 milliseconds.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Charlin
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, 13 Nov 2024 at 22:36, Pavel Tupitsyn <ptupit...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> - This shows a full scan, the index is not being used.
>>>>>>>>> - There is no parametrization, I'm not sure why. Which Ignite
>>>>>>>>> version do you use?
>>>>>>>>>
>>>>>>>>> I've prepared a simple proof of concept [1] which shows that an
>>>>>>>>> index is used in this situation. Generated SQL is different:
>>>>>>>>>
>>>>>>>>> >> SQL:
>>>>>>>>> select _T0._KEY, _T0._VAL from "c".TESTMODEL as _T0 where
>>>>>>>>> (_T0.COUNTRYCODE IS NOT DISTINCT FROM ?)
>>>>>>>>>
>>>>>>>>> >> EXPLAIN:
>>>>>>>>> SELECT
>>>>>>>>>     _T0__Z0._KEY AS __C0_0,
>>>>>>>>>     _T0__Z0._VAL AS __C0_1
>>>>>>>>> FROM "c".TESTMODEL _T0__Z0
>>>>>>>>>     /* "c".TESTMODEL_COUNTRYCODE_ASC_IDX: COUNTRYCODE IS ?1 */
>>>>>>>>> WHERE _T0__Z0.COUNTRYCODE IS ?1
>>>>>>>>> SELECT
>>>>>>>>>     __C0_0 AS _KEY,
>>>>>>>>>     __C0_1 AS _VAL
>>>>>>>>> FROM PUBLIC.__T0
>>>>>>>>>     /* "c"."merge_scan" */
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>> https://gist.github.com/ptupitsyn/882b9b9e5e13c82fcf96f83fd53b2777
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Nov 13, 2024 at 6:41 PM Charlin S <charli...@hotelhub.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>> 1. Use "==" instead of "string.Equals", looks like the provider
>>>>>>>>>> does not like string.Equals.
>>>>>>>>>>        changed to ==
>>>>>>>>>> 2. Share the generated SQL
>>>>>>>>>>
>>>>>>>>>>   SELECT Field1,Field2 FROM TestModel WHERE COUNTRYCODE='AU'
>>>>>>>>>> 3. Share the result of EXPLAIN for that SQL
>>>>>>>>>>    plan=SELECT
>>>>>>>>>>     __Z0.FIELD1 AS __C0_0,
>>>>>>>>>>     __Z0.FIELD2 AS __C0_1
>>>>>>>>>> FROM "TestModel".TESTMODEL __Z0
>>>>>>>>>>     /* "TestModel".TESTMODEL.__SCAN_ */
>>>>>>>>>>     /* scanCount: 1681786 */
>>>>>>>>>> WHERE (__Z0.COUNTRYCODE = 'AU')
>>>>>>>>>> 4. Is there an index on CountryCode?
>>>>>>>>>>     Yes,  [QuerySqlField(IsIndexed = true)]
>>>>>>>>>>         public string CountryCode { get; set; }
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Charlin
>>>>>>>>>>
>>>>>>>>>> On Wed, 13 Nov 2024 at 15:06, Pavel Tupitsyn <
>>>>>>>>>> ptupit...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> 1. Use "==" instead of "string.Equals", looks like the provider
>>>>>>>>>>> does not like string.Equals
>>>>>>>>>>> 2. Share the generated SQL
>>>>>>>>>>> 3. Share the result of EXPLAIN for that SQL
>>>>>>>>>>> 4. Is there an index on CountryCode?
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Nov 13, 2024 at 9:16 AM Charlin S <
>>>>>>>>>>> charli...@hotelhub.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> Thanks for your response.
>>>>>>>>>>>> I have tried different ways but the result is the same.
>>>>>>>>>>>> my cache records count is above 160,0000
>>>>>>>>>>>> var watchLINQQuery = System.Diagnostics.Stopwatch.StartNew();
>>>>>>>>>>>>
>>>>>>>>>>>>       var tmp=  TestIcache.AsEnumerable().Where(tc=> string.
>>>>>>>>>>>> Equals(tc.Value.CountryCode, CountryCode)).Select(tc => tc.
>>>>>>>>>>>> Value);
>>>>>>>>>>>>       watchLINQQuery.Stop(); //0 or 1 Milliseconds
>>>>>>>>>>>>       var watchIQueryableToArray = System.Diagnostics.Stopwatch
>>>>>>>>>>>> .StartNew();
>>>>>>>>>>>>       var result = tmp.ToArray(); // 12354 milliseconds taken
>>>>>>>>>>>>       watchIQueryableToArray.Stop();
>>>>>>>>>>>>
>>>>>>>>>>>> var result = tmp.ToArray(); taking similar time even if my
>>>>>>>>>>>> query result returns 1 or 2 records. Please suggest to me how to 
>>>>>>>>>>>> improve
>>>>>>>>>>>> this query performance.
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Charlin
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, 5 Nov 2024 at 19:01, Pavel Tupitsyn <
>>>>>>>>>>>> ptupit...@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> 1. Check the generated SQL
>>>>>>>>>>>>>
>>>>>>>>>>>>> // Cast to ICacheQueryable
>>>>>>>>>>>>> var cacheQueryable = (ICacheQueryable) query;
>>>>>>>>>>>>>
>>>>>>>>>>>>> // Get resulting fields query
>>>>>>>>>>>>> SqlFieldsQuery fieldsQuery = cacheQueryable.GetFieldsQuery();
>>>>>>>>>>>>>
>>>>>>>>>>>>> // Examine generated SQL
>>>>>>>>>>>>> Console.WriteLine(fieldsQuery.Sql);
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2. Try EXPLAIN and other suggestions from
>>>>>>>>>>>>> https://ignite.apache.org/docs/latest/SQL/sql-tuning
>>>>>>>>>>>>>
>>>>>>>>>>>>> 3. Is there an index on CountryCode?
>>>>>>>>>>>>>
>>>>>>>>>>>>> 4. Case-insensitive comparison might be inefficient. A better
>>>>>>>>>>>>> approach is to store lower/uppercase value in cache, and then 
>>>>>>>>>>>>> using
>>>>>>>>>>>>> lower/upper criteria for search
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Nov 5, 2024 at 1:08 PM Charlin S <
>>>>>>>>>>>>> charli...@hotelhub.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi  Pavel,
>>>>>>>>>>>>>> Thanks for your email. it reduces to 9 seconds after removing
>>>>>>>>>>>>>> AsParallel. Please let me know if there are any more options to 
>>>>>>>>>>>>>> get good
>>>>>>>>>>>>>> performance.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>> Charlin
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, 5 Nov 2024 at 13:31, Pavel Tupitsyn <
>>>>>>>>>>>>>> ptupit...@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi, "AsParallel" is the problem, it causes the entire data
>>>>>>>>>>>>>>> set to be loaded locally before filtering.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Remove it so that the LINQ expression can be translated into
>>>>>>>>>>>>>>> Ignite SQL and executed more efficiently.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://ignite.apache.org/docs/latest/net-specific/net-linq
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Nov 5, 2024 at 8:58 AM Charlin S <
>>>>>>>>>>>>>>> charli...@hotelhub.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am trying Ignit.Net LINQ for the first time and seeing
>>>>>>>>>>>>>>>> very slowness with my linq query taking 13-15 seconds. Test 
>>>>>>>>>>>>>>>> model having
>>>>>>>>>>>>>>>> 550,000 records
>>>>>>>>>>>>>>>> my query as below
>>>>>>>>>>>>>>>> TestModel having index for CountryCode field.
>>>>>>>>>>>>>>>> ICache<string, TestModel> cache = ignite.GetCache<string,
>>>>>>>>>>>>>>>> TestModel>(CacheName);
>>>>>>>>>>>>>>>> IQueryable<ICacheEntry<string, TestModel>>  igniteQuerable
>>>>>>>>>>>>>>>> = cache.AsCacheQueryable();
>>>>>>>>>>>>>>>> igniteQuerable.AsParallel()
>>>>>>>>>>>>>>>>             .Where(x=>string.Equals(x.Value.CountryCode,
>>>>>>>>>>>>>>>> criteria.CountryCode, StringComparison.
>>>>>>>>>>>>>>>> CurrentCultureIgnoreCase))
>>>>>>>>>>>>>>>>             .Select(x => x.Key).ToList();
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>> Charlin
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>

Reply via email to