The attached patch implements a bit-packing scheme so that short ranges
can be stored directly within a 32-bit source_location (aka location_t)
without needing to use the ad-hoc table.  The intent is to mitigate the
overhead introduced in the earlier patch that added ranges for all tokens
in libcpp: every token up to 2**N characters long can be stored without
needing the ad-hoc table.  Other short ranges for expressions can be
stored compactly, provided that caret==start.

N is currently 5.  The choice is somewhat arbitrary, but seems to work.
The default number of column bits remains 7, meaning that the low 12 bits
of ordinary location_t values are used for columns and packed ranges.
More details of the packing scheme can be seen in the patch's change
to line-map.h.
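
The packing and unpacking arithmetic can be sketched roughly as follows.
This is a simplified, standalone illustration rather than the code in the
patch: loc_t, RANGE_BITS, loc_is_pure, pack_range and unpack_range are
names made up for the sketch, and it assumes a single map with a fixed
range_bits of 5.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for this sketch; not names from the patch.  */
typedef uint32_t loc_t;

enum { RANGE_BITS = 5 };  /* Mirrors default_range_bits == 5.  */

/* Return true if LOC has no range packed into its low bits.  */
static bool
loc_is_pure (loc_t loc)
{
  return (loc & ((1u << RANGE_BITS) - 1)) == 0;
}

/* Try to pack the range [START, FINISH] into a single value, assuming
   caret == START and that both ends are pure locations.  Return false
   if the range is too long; a real caller would then fall back to the
   ad-hoc table.  */
static bool
pack_range (loc_t start, loc_t finish, loc_t *packed)
{
  assert (loc_is_pure (start) && loc_is_pure (finish));
  if (finish < start)
    return false;
  /* Columns are (1 << RANGE_BITS) apart, so shift to count columns.  */
  uint32_t col_diff = (finish - start) >> RANGE_BITS;
  if (col_diff >= (1u << RANGE_BITS))
    return false;
  *packed = start | col_diff;
  return true;
}

/* Recover the start and finish of a packed range.  */
static void
unpack_range (loc_t packed, loc_t *start, loc_t *finish)
{
  uint32_t offset = packed & ((1u << RANGE_BITS) - 1);
  *start = packed - offset;
  *finish = *start + (offset << RANGE_BITS);
}
```

With these definitions, a start of 64 and a finish four columns further
on (64 + 4 * 32) packs to 68, matching the "start_location+68" example
in the updated line-map.h comment.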

default_range_bits == 5 entails a 32-fold reduction in the size of the
code we can compile before the various fallbacks take effect (stopping
tracking columns then stopping tracking locations altogether).
In the former case, when we stop tracking columns, we also stop packing
ranges.
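
As a rough illustration of the 32-fold figure: column tracking stops once
locations exceed LINE_MAP_MAX_LOCATION_WITH_COLS (0x60000000), and each
line consumes (1 << bits) location values.  The sketch below assumes every
line uses the default number of bits and ignores the gaps introduced when
new maps are started; max_lines_with_columns is a name made up for the
example.

```c
#include <assert.h>
#include <stdint.h>

/* Approximate number of source lines that fit below LIMIT when each
   line consumes (1 << BITS_PER_LINE) location values.  */
static uint32_t
max_lines_with_columns (uint32_t limit, unsigned bits_per_line)
{
  assert (bits_per_line < 32);
  return limit >> bits_per_line;
}

/* With 7 bits per line (columns only):
     max_lines_with_columns (0x60000000u, 7)  == 12582912
   With 12 bits per line (7 column bits + 5 range bits):
     max_lines_with_columns (0x60000000u, 12) == 393216
   The ratio between the two is 1 << 5 == 32.  */
```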

The range_bits value needs to be per-ordinary-map, to cope with the case
where an ordinary map's column_and_range_bits could be zero (e.g. due to
a very long line).
This requires figuring out which ordinary map a source_location is in
when generating compact ranges.  Luckily, this seems to hit the cached
map when tokenizing, avoiding the somewhat expensive binary search
through the ordinary maps.
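
The effect of the cached map can be sketched as below.  This is a much
simplified model and not libcpp's actual lookup code: struct map_set and
lookup_map are hypothetical, and maps are reduced to a sorted array of
start locations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the set of ordinary maps.  */
struct map_set
{
  const uint32_t *starts;  /* Sorted start locations, one per map.  */
  size_t used;             /* Number of maps; must be > 0.  */
  size_t cache;            /* Index of the most recently found map.  */
  unsigned searches;       /* Number of lookups that missed the cache.  */
};

/* Return the index of the map containing LOC, i.e. the last map whose
   start location is <= LOC.  LOC must be >= starts[0].  */
static size_t
lookup_map (struct map_set *set, uint32_t loc)
{
  assert (set->used > 0 && loc >= set->starts[0]);

  /* Fast path: LOC falls within the cached map.  */
  size_t c = set->cache;
  if (set->starts[c] <= loc
      && (c + 1 == set->used || loc < set->starts[c + 1]))
    return c;

  /* Slow path: binary search.  Invariant: starts[lo] <= loc and the
     answer lies in [lo, hi).  */
  set->searches++;
  size_t lo = 0, hi = set->used;
  while (hi - lo > 1)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (set->starts[mid] <= loc)
        lo = mid;
      else
        hi = mid;
    }
  set->cache = lo;
  return lo;
}
```

When tokenizing, consecutive tokens normally fall within the same map, so
in this model nearly all lookups would take the fast path.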

Some benchmarks can be seen in this post:
  https://gcc.gnu.org/ml/gcc-patches/2015-10/msg02283.html

gcc/ada/ChangeLog:
        * gcc-interface/trans.c (Sloc_to_locus): Add line_table param when
        calling linemap_position_for_line_and_column.

gcc/ChangeLog:
        * input.c (dump_line_table_statistics): Dump stats on how many
        ranges were optimized vs how many needed ad-hoc table.
        (write_digit_row): Add "map" param; use its range_bits
        to calculate the per-character offset.
        (dump_location_info): Print the range and column bits for each
        ordinary map.  Use the range bits to calculate the per-character
        offset.  Pass the map as a new param to the various calls to
        write_digit_row.  Eliminate uses of
        ORDINARY_MAP_NUMBER_OF_COLUMN_BITS.
        * toplev.c (general_init): Initialize line_table's
        default_range_bits.
        * tree.c (get_pure_location): New function.
        (set_block): Use the pure form of the location for the
        caret in the combined location.
        (set_source_range): Likewise.

gcc/testsuite/ChangeLog:
        * gcc.dg/plugin/diagnostic_plugin_test_show_locus.c (get_loc): Add
        line_table param when calling
        linemap_position_for_line_and_column.
        * gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c
        (emit_warning): Remove restriction that "loc" must be ad-hoc.

libcpp/ChangeLog:
        * include/line-map.h (source_location): Update the descriptive
        comment to reflect the packing scheme for short ranges.
        (struct line_map_ordinary): Drop field "column_bits" in favor
        of field "m_column_and_range_bits"; add field "m_range_bits".
        (ORDINARY_MAP_NUMBER_OF_COLUMN_BITS): Delete.
        (struct line_maps): Add fields "default_range_bits",
        "num_optimized_ranges" and "num_unoptimized_ranges".
        (get_range_from_adhoc_loc): Delete prototype.
        (get_range_from_loc): Convert from an inline function to a
        prototype.
        (pure_location_p): New prototype.
        (SOURCE_LINE): Update for renaming of column_bits.
        (SOURCE_COLUMN): Likewise.  Shift the column right by the map's
        range_bits.
        (LAST_SOURCE_LINE_LOCATION): Update for renaming of column_bits.
        (linemap_position_for_line_and_column): Add line_maps * param.
        * lex.c (_cpp_lex_direct): Don't attempt to record token ranges
        for UNKNOWN_LOCATION and BUILTINS_LOCATION.
        * line-map.c (LINE_MAP_MAX_COLUMN_NUMBER): Reduce from 1U << 17 to
        1U << 9.
        (can_be_stored_compactly_p): New function.
        (get_combined_adhoc_loc): Implement bit-packing scheme for short
        ranges.
        (get_range_from_adhoc_loc): Make static.
        (get_range_from_loc): New function.
        (pure_location_p): New function.
        (linemap_add): Ensure that start_location has zero for the
        range_bits, unless we're past LINE_MAP_MAX_LOCATION_WITH_COLS.
        Initialize range_bits to zero.  Assert that the start_location
        is "pure".
        (linemap_line_start): Assert that the
        column_and_range_bits >= range_bits.
        Update determination of whether we need to start a new map
        using the effective column bits, without the range bits.
        Use the set's default_range_bits in new maps, apart from
        those with column_bits == 0, which should also have 0 range_bits.
        Increase the column bits for new maps by the range bits.
        When adding lines to an existing map, use set->highest_line
        directly rather than offsetting highest by SOURCE_COLUMN.
        Add assertions to sanity-check the return value.
        (linemap_position_for_column): Offset to_column by range_bits.
        Update set->highest_location if necessary.
        (linemap_position_for_line_and_column): Add line_maps * param.
        Update the calculation to offset the column by range_bits, and
        conditionalize it on being <= LINE_MAP_MAX_LOCATION_WITH_COLS.
        Bound it by LINEMAPS_MACRO_LOWEST_LOCATION.  Update
        set->highest_location if necessary.
        (linemap_position_for_loc_and_offset): Pass "set" to
        linemap_position_for_line_and_column.
        * location-example.txt: Regenerate, showing new representation.
---
 gcc/ada/gcc-interface/trans.c                      |   3 +-
 gcc/input.c                                        |  28 ++-
 .../plugin/diagnostic_plugin_test_show_locus.c     |   3 +-
 .../diagnostic_plugin_test_tree_expression_range.c |   8 +-
 gcc/toplev.c                                       |   1 +
 gcc/tree.c                                         |  25 ++-
 libcpp/include/line-map.h                          | 121 +++++++----
 libcpp/lex.c                                       |   9 +-
 libcpp/line-map.c                                  | 229 +++++++++++++++++++--
 libcpp/location-example.txt                        | 188 +++++++++--------
 10 files changed, 450 insertions(+), 165 deletions(-)

diff --git a/gcc/ada/gcc-interface/trans.c b/gcc/ada/gcc-interface/trans.c
index f1e2dcb..c3ff66a 100644
--- a/gcc/ada/gcc-interface/trans.c
+++ b/gcc/ada/gcc-interface/trans.c
@@ -9618,7 +9618,8 @@ Sloc_to_locus (Source_Ptr Sloc, location_t *locus, bool clear_column)
     line = 1;
 
   /* Translate the location.  */
-  *locus = linemap_position_for_line_and_column (map, line, column);
+  *locus = linemap_position_for_line_and_column (line_table, map,
+                                                line, column);
 
   return true;
 }
diff --git a/gcc/input.c b/gcc/input.c
index baf8e7e..6aae857 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -878,6 +878,10 @@ dump_line_table_statistics (void)
           STAT_LABEL (s.adhoc_table_size));
   fprintf (stderr, "Ad-hoc table entries used:           %5ld\n",
           s.adhoc_table_entries_used);
+  fprintf (stderr, "optimized_ranges: %i\n",
+          line_table->num_optimized_ranges);
+  fprintf (stderr, "unoptimized_ranges: %i\n",
+          line_table->num_unoptimized_ranges);
 
   fprintf (stderr, "\n");
 }
@@ -908,13 +912,14 @@ write_digit (FILE *stream, int digit)
 
 static void
 write_digit_row (FILE *stream, int indent,
+                const line_map_ordinary *map,
                 source_location loc, int max_col, int divisor)
 {
   fprintf (stream, "%*c", indent, ' ');
   fprintf (stream, "|");
   for (int column = 1; column < max_col; column++)
     {
-      source_location column_loc = loc + column;
+      source_location column_loc = loc + (column << map->m_range_bits);
       write_digit (stream, column_loc / divisor);
     }
   fprintf (stream, "\n");
@@ -968,14 +973,20 @@ dump_location_info (FILE *stream)
       fprintf (stream, "  file: %s\n", ORDINARY_MAP_FILE_NAME (map));
       fprintf (stream, "  starting at line: %i\n",
               ORDINARY_MAP_STARTING_LINE_NUMBER (map));
+      fprintf (stream, "  column and range bits: %i\n",
+              map->m_column_and_range_bits);
       fprintf (stream, "  column bits: %i\n",
-              ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map));
+              map->m_column_and_range_bits - map->m_range_bits);
+      fprintf (stream, "  range bits: %i\n",
+              map->m_range_bits);
 
       /* Render the span of source lines that this "map" covers.  */
       for (source_location loc = MAP_START_LOCATION (map);
           loc < end_location;
-          loc++)
+          loc += (1 << map->m_range_bits) )
        {
+         gcc_assert (pure_location_p (line_table, loc) );
+
          expanded_location exploc
            = linemap_expand_location (line_table, map, loc);
 
@@ -999,8 +1010,7 @@ dump_location_info (FILE *stream)
                 Render the locations *within* the line, by underlining
                 it, showing the source_location numeric values
                 at each column.  */
-             int max_col
-               = (1 << ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map)) - 1;
+             int max_col = (1 << map->m_column_and_range_bits) - 1;
              if (max_col > line_size)
                max_col = line_size + 1;
 
@@ -1008,17 +1018,17 @@ dump_location_info (FILE *stream)
 
              /* Thousands.  */
              if (end_location > 999)
-               write_digit_row (stream, indent, loc, max_col, 1000);
+               write_digit_row (stream, indent, map, loc, max_col, 1000);
 
              /* Hundreds.  */
              if (end_location > 99)
-               write_digit_row (stream, indent, loc, max_col, 100);
+               write_digit_row (stream, indent, map, loc, max_col, 100);
 
              /* Tens.  */
-             write_digit_row (stream, indent, loc, max_col, 10);
+             write_digit_row (stream, indent, map, loc, max_col, 10);
 
              /* Units.  */
-             write_digit_row (stream, indent, loc, max_col, 1);
+             write_digit_row (stream, indent, map, loc, max_col, 1);
            }
        }
       fprintf (stream, "\n");
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
index 4c6120d..14a8d91 100644
--- a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
@@ -109,7 +109,8 @@ get_loc (unsigned int line_num, unsigned int col_num)
 
   /* Convert from 0-based column numbers to 1-based column numbers.  */
   source_location loc
-    = linemap_position_for_line_and_column (line_map,
+    = linemap_position_for_line_and_column (line_table,
+                                           line_map,
                                            line_num, col_num + 1);
 
   return loc;
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c
index ca54278..89cc95a 100644
--- a/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_tree_expression_range.c
@@ -36,13 +36,7 @@ int plugin_is_GPL_compatible;
 static void
 emit_warning (location_t loc)
 {
-  if (!IS_ADHOC_LOC (loc))
-    {
-      error_at (loc, "ad-hoc location not found");
-      return;
-    }
-
-  source_range src_range = get_range_from_adhoc_loc (line_table, loc);
+  source_range src_range = get_range_from_loc (line_table, loc);
   warning_at (loc, 0,
              "tree range %i:%i-%i:%i",
              LOCATION_LINE (src_range.m_start),
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 6d740d4..7067d96 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1147,6 +1147,7 @@ general_init (const char *argv0, bool init_signals)
   linemap_init (line_table, BUILTINS_LOCATION);
   line_table->reallocator = realloc_for_line_map;
   line_table->round_alloc_size = ggc_round_alloc_size;
+  line_table->default_range_bits = 5;
   init_ttree ();
 
   /* Initialize register usage now so switches may override.  */
diff --git a/gcc/tree.c b/gcc/tree.c
index a676352..4ec4a38 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -13653,11 +13653,31 @@ nonnull_arg_p (const_tree arg)
   return false;
 }
 
+static location_t
+get_pure_location (location_t loc)
+{
+  if (IS_ADHOC_LOC (loc))
+    loc
+      = line_table->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
+
+  if (loc >= LINEMAPS_MACRO_LOWEST_LOCATION (line_table))
+    return loc;
+
+  if (loc < RESERVED_LOCATION_COUNT)
+    return loc;
+
+  const line_map *map = linemap_lookup (line_table, loc);
+  const line_map_ordinary *ordmap = linemap_check_ordinary (map);
+
+  return loc & ~((1 << ordmap->m_range_bits) - 1);
+}
+
 location_t
 set_block (location_t loc, tree block)
 {
+  location_t pure_loc = get_pure_location (loc);
   source_range src_range = get_range_from_loc (line_table, loc);
-  return COMBINE_LOCATION_DATA (line_table, loc, src_range, block);
+  return COMBINE_LOCATION_DATA (line_table, pure_loc, src_range, block);
 }
 
 void
@@ -13675,8 +13695,9 @@ set_source_range (tree expr, source_range src_range)
   if (!EXPR_P (expr))
     return;
 
+  location_t pure_loc = get_pure_location (EXPR_LOCATION (expr));
   location_t adhoc = COMBINE_LOCATION_DATA (line_table,
-                                           EXPR_LOCATION (expr),
+                                           pure_loc,
                                            src_range,
                                            NULL);
   SET_EXPR_LOCATION (expr, adhoc);
diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index 0ef29d9..1a2dab8 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -47,7 +47,8 @@ enum lc_reason
 typedef unsigned int linenum_type;
 
 /* The typedef "source_location" is a key within the location database,
-   identifying a source location or macro expansion.
+   identifying a source location or macro expansion, along with range
+   information, and (optionally) a pointer for use by gcc.
 
    This key only has meaning in relation to a line_maps instance.  Within
    gcc there is a single line_maps instance: "line_table", declared in
@@ -69,13 +70,48 @@ typedef unsigned int linenum_type;
              |  ordmap[0]->start_location)   | first line in ordmap 0
   -----------+-------------------------------+-------------------------------
              | ordmap[1]->start_location     | First line in ordmap 1
-             | ordmap[1]->start_location+1   | First column in that line
-             | ordmap[1]->start_location+2   | 2nd column in that line
-             |                               | Subsequent lines are offset by
-             |                               | (1 << column_bits),
-             |                               | e.g. 128 for 7 bits, with a
-             |                               | column value of 0 representing
-             |                               | "the whole line".
+             | ordmap[1]->start_location+32  | First column in that line
+             |   (assuming range_bits == 5)  |
+             | ordmap[1]->start_location+64  | 2nd column in that line
+             | ordmap[1]->start_location+4096| Second line in ordmap 1
+             |   (assuming column_bits == 12)
+             |
+             |   Subsequent lines are offset by (1 << column_bits),
+             |   e.g. 4096 for 12 bits, with a column value of 0 representing
+             |   "the whole line".
+             |
+             |   Within a line, the low "range_bits" (typically 5) are used for
+             |   storing short ranges, so that there's an offset of
+             |     (1 << range_bits) between individual columns within a line,
+             |   typically 32.
+             |   The low range_bits store the offset of the end point from the
+             |   start point, and the start point is found by masking away
+             |   the range bits.
+             |
+             |   For example:
+             |      ordmap[1]->start_location+64    "2nd column in that line"
+             |   above means a caret at that location, with a range
+             |   starting and finishing at the same place (the range bits
+             |   are 0), a range of length 1.
+             |
+             |   By contrast:
+             |      ordmap[1]->start_location+68
+             |   has range bits 0x4, meaning a caret with a range starting at
+             |   that location, but with endpoint 4 columns further on: a range
+             |   of length 5.
+             |
+             |   Ranges that have caret != start, or have an endpoint too
+             |   far away to fit in range_bits are instead stored as ad-hoc
+             |   locations.  Hence for range_bits == 5 we can compactly store
+             |   tokens of length <= 32 without needing to use the ad-hoc
+             |   table.
+             |
+             |   This packing scheme means we effectively have
+             |     (column_bits - range_bits)
+             |   of bits for the columns, typically (12 - 5) = 7, for 128
+             |   columns; longer line widths are accommodated by starting a
+             |   new ordmap with a higher column_bits.
+             |
              | ordmap[2]->start_location-1   | Final location in ordmap 1
   -----------+-------------------------------+-------------------------------
              | ordmap[2]->start_location     | First line in ordmap 2
@@ -205,8 +241,9 @@ struct GTY((tag ("0"), desc ("%h.reason == LC_ENTER_MACRO ? 2 : 1"))) line_map {
    
    Physical source file TO_FILE at line TO_LINE at column 0 is represented
    by the logical START_LOCATION.  TO_LINE+L at column C is represented by
-   START_LOCATION+(L*(1<<column_bits))+C, as long as C<(1<<column_bits),
-   and the result_location is less than the next line_map's start_location.
+   START_LOCATION+(L*(1<<m_column_and_range_bits))+(C*(1<<m_range_bits)), as
+   long as C<(1<<effective column bits), and the result_location is less than
+   the next line_map's start_location.
    (The top line is line 1 and the leftmost column is column 1; line/column 0
    means "entire file/line" or "unknown line/column" or "not applicable".)
 
@@ -226,8 +263,24 @@ struct GTY((tag ("1"))) line_map_ordinary : public line_map {
      cpp_buffer.  */
   unsigned char sysp;
 
-  /* Number of the low-order source_location bits used for a column number.  */
-  unsigned int column_bits : 8;
+  /* Number of the low-order source_location bits used for column numbers
+     and ranges.  */
+  unsigned int m_column_and_range_bits : 8;
+
+  /* Number of the low-order "column" bits used for storing short ranges
+     inline, rather than in the ad-hoc table.
+     MSB                                                                 LSB
+     31                                                                    0
+     +-------------------------+-------------------------------------------+
+     |                         |<---map->column_and_range_bits (e.g. 12)-->|
+     +-------------------------+-----------------------+-------------------+
+     |                         | column_and_range_bits | map->range_bits   |
+     |                         |   - range_bits        |                   |
+     +-------------------------+-----------------------+-------------------+
+     | row bits                | effective column bits | short range bits  |
+     |                         |    (e.g. 7)           |   (e.g. 5)        |
+     +-------------------------+-----------------------+-------------------+ */
+  unsigned int m_range_bits : 8;
 };
 
 /* This is the highest possible source location encoded within an
@@ -423,15 +476,6 @@ ORDINARY_MAP_IN_SYSTEM_HEADER_P (const line_map_ordinary *ord_map)
   return ord_map->sysp;
 }
 
-/* Get the number of the low-order source_location bits used for a
-   column number within ordinary map MAP.  */
-
-inline unsigned char
-ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (const line_map_ordinary *ord_map)
-{
-  return ord_map->column_bits;
-}
-
 /* Get the filename of ordinary map MAP.  */
 
 inline const char *
@@ -578,6 +622,12 @@ struct GTY(()) line_maps {
 
   /* True if we've seen a #line or # 44 "file" directive.  */
   bool seen_line_directive;
+
+  /* The default value of range_bits in ordinary line maps.  */
+  unsigned int default_range_bits;
+
+  unsigned int num_optimized_ranges;
+  unsigned int num_unoptimized_ranges;
 };
 
 /* Returns the number of allocated maps so far. MAP_KIND shall be TRUE
@@ -821,8 +871,10 @@ extern source_location get_combined_adhoc_loc (struct line_maps *,
 extern void *get_data_from_adhoc_loc (struct line_maps *, source_location);
 extern source_location get_location_from_adhoc_loc (struct line_maps *,
                                                    source_location);
-extern source_range get_range_from_adhoc_loc (struct line_maps *,
-                                             source_location);
+
+extern source_range
+get_range_from_loc (line_maps *set,
+                   source_location loc);
 
 /* Get whether location LOC is an ad-hoc location.  */
 
@@ -832,15 +884,11 @@ IS_ADHOC_LOC (source_location loc)
   return (loc & MAX_SOURCE_LOCATION) != loc;
 }
 
-inline source_range
-get_range_from_loc (struct line_maps *set,
-                   source_location loc)
-{
-  if (IS_ADHOC_LOC (loc))
-    return get_range_from_adhoc_loc (set, loc);
-  else
-    return source_range::from_location (loc);
-}
+/* Get whether location LOC is a "pure" location, or
+   whether it is an ad-hoc location, or embeds range information.  */
+
+bool
+pure_location_p (line_maps *set, source_location loc);
 
 /* Combine LOC and BLOCK, giving a combined adhoc location.  */
 
@@ -936,7 +984,7 @@ inline linenum_type
 SOURCE_LINE (const line_map_ordinary *ord_map, source_location loc)
 {
   return ((loc - ord_map->start_location)
-         >> ord_map->column_bits) + ord_map->to_line;
+         >> ord_map->m_column_and_range_bits) + ord_map->to_line;
 }
 
 /* Convert a map and source_location to source column number.  */
@@ -944,7 +992,7 @@ inline linenum_type
 SOURCE_COLUMN (const line_map_ordinary *ord_map, source_location loc)
 {
   return ((loc - ord_map->start_location)
-         & ((1 << ord_map->column_bits) - 1));
+         & ((1 << ord_map->m_column_and_range_bits) - 1)) >> ord_map->m_range_bits;
 }
 
 /* Return the location of the last source line within an ordinary
@@ -954,7 +1002,7 @@ LAST_SOURCE_LINE_LOCATION (const line_map_ordinary *map)
 {
   return (((map[1].start_location - 1
            - map->start_location)
-          & ~((1 << map->column_bits) - 1))
+          & ~((1 << map->m_column_and_range_bits) - 1))
          + map->start_location);
 }
 
@@ -1004,7 +1052,8 @@ linemap_position_for_column (struct line_maps *, unsigned int);
 /* Encode and return a source location from a given line and
    column.  */
 source_location
-linemap_position_for_line_and_column (const line_map_ordinary *,
+linemap_position_for_line_and_column (line_maps *set,
+                                     const line_map_ordinary *,
                                      linenum_type, unsigned int);
 
 /* Encode and return a source_location starting from location LOC and
diff --git a/libcpp/lex.c b/libcpp/lex.c
index f4c964f..6f46a7f 100644
--- a/libcpp/lex.c
+++ b/libcpp/lex.c
@@ -2725,9 +2725,12 @@ _cpp_lex_direct (cpp_reader *pfile)
 
   source_range tok_range;
   tok_range.m_start = result->src_loc;
-  tok_range.m_finish =
-    linemap_position_for_column (pfile->line_table,
-                                CPP_BUF_COLUMN (buffer, buffer->cur));
+  if (result->src_loc >= RESERVED_LOCATION_COUNT)
+    tok_range.m_finish =
+      linemap_position_for_column (pfile->line_table,
+                                  CPP_BUF_COLUMN (buffer, buffer->cur));
+  else
+    tok_range.m_finish = tok_range.m_start;
 
   result->src_loc = COMBINE_LOCATION_DATA (pfile->line_table,
                                           result->src_loc,
diff --git a/libcpp/line-map.c b/libcpp/line-map.c
index 6385fdf..fe8d784 100644
--- a/libcpp/line-map.c
+++ b/libcpp/line-map.c
@@ -29,7 +29,7 @@ along with this program; see the file COPYING3.  If not see
 /* Do not track column numbers higher than this one.  As a result, the
    range of column_bits is [7, 18] (or 0 if column numbers are
    disabled).  */
-const unsigned int LINE_MAP_MAX_COLUMN_NUMBER = (1U << 17);
+const unsigned int LINE_MAP_MAX_COLUMN_NUMBER = (1U << 9);
 
 /* Do not track column numbers if locations get higher than this.  */
 const source_location LINE_MAP_MAX_LOCATION_WITH_COLS = 0x60000000;
@@ -112,6 +112,49 @@ rebuild_location_adhoc_htab (struct line_maps *set)
                    set->location_adhoc_data_map.data + i, INSERT);
 }
 
+/* Helper function for get_combined_adhoc_loc.
+   Can the given LOCUS + SRC_RANGE and DATA pointer be stored compactly
+   within a source_location, without needing to use an ad-hoc location?  */
+
+static bool
+can_be_stored_compactly_p (struct line_maps *set,
+                          source_location locus,
+                          source_range src_range,
+                          void *data)
+{
+  /* If there's an ad-hoc pointer, we can't store it directly in the
+     source_location, we need the lookaside.  */
+  if (data)
+    return false;
+
+  /* We only store ranges that begin at the locus and that are sufficiently
+     "sane".  */
+  if (src_range.m_start != locus)
+    return false;
+
+  if (src_range.m_finish < src_range.m_start)
+    return false;
+
+  if (src_range.m_start < RESERVED_LOCATION_COUNT)
+    return false;
+
+  if (locus >= LINE_MAP_MAX_LOCATION_WITH_COLS)
+    return false;
+
+  /* All 3 locations must be within ordinary maps, typically, the same
+     ordinary map.  */
+  source_location lowest_macro_loc = LINEMAPS_MACRO_LOWEST_LOCATION (set);
+  if (locus >= lowest_macro_loc)
+    return false;
+  if (src_range.m_start >= lowest_macro_loc)
+    return false;
+  if (src_range.m_finish >= lowest_macro_loc)
+    return false;
+
+  /* Passed all tests.  */
+  return true;
+}
+
 /* Combine LOCUS and DATA to a combined adhoc loc.  */
 
 source_location
@@ -128,6 +171,60 @@ get_combined_adhoc_loc (struct line_maps *set,
       = set->location_adhoc_data_map.data[locus & MAX_SOURCE_LOCATION].locus;
   if (locus == 0 && data == NULL)
     return 0;
+
+  /* Any ordinary locations ought to be "pure" at this point: no
+     compressed ranges.  */
+  linemap_assert (locus < RESERVED_LOCATION_COUNT
+                 || locus >= LINE_MAP_MAX_LOCATION_WITH_COLS
+                 || locus >= LINEMAPS_MACRO_LOWEST_LOCATION (set)
+                 || pure_location_p (set, locus));
+
+#define DEBUG_PACKING 0
+
+#if DEBUG_PACKING
+  fprintf (stderr, "get_combined_adhoc_loc: %x %x %x\n",
+          locus, src_range.m_start, src_range.m_finish);
+#endif
+
+  /* Consider short-range optimization.  */
+  if (can_be_stored_compactly_p (set, locus, src_range, data))
+    {
+      /* The low bits ought to be clear.  */
+      linemap_assert (pure_location_p (set, locus));
+      const line_map *map = linemap_lookup (set, locus);
+      const line_map_ordinary *ordmap = linemap_check_ordinary (map);
+      unsigned int int_diff = src_range.m_finish - src_range.m_start;
+      unsigned int col_diff = (int_diff >> ordmap->m_range_bits);
+      if (col_diff < (1U << ordmap->m_range_bits))
+       {
+         source_location packed = locus | col_diff;
+         set->num_optimized_ranges++;
+#if DEBUG_PACKING
+         fprintf (stderr, "  optimized to %x\n", packed);
+#endif
+         return packed;
+       }
+    }
+
+  /* We can also compactly store the reserved locations
+     when locus == start == finish (and data is NULL).  */
+  if (locus < RESERVED_LOCATION_COUNT
+      && locus == src_range.m_start
+      && locus == src_range.m_finish
+      && !data)
+    {
+#if DEBUG_PACKING
+      fprintf (stderr, "  using reserved location: %x\n", locus);
+#endif
+      return locus;
+    }
+
+#if DEBUG_PACKING
+  fprintf (stderr, "  unoptimized\n");
+#endif
+  if (!data)
+    set->num_unoptimized_ranges++;
+
   lb.locus = locus;
   lb.src_range = src_range;
   lb.data = data;
@@ -184,13 +281,58 @@ get_location_from_adhoc_loc (struct line_maps *set, source_location loc)
   return set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].locus;
 }
 
-source_range
+static source_range
 get_range_from_adhoc_loc (struct line_maps *set, source_location loc)
 {
   linemap_assert (IS_ADHOC_LOC (loc));
   return set->location_adhoc_data_map.data[loc & MAX_SOURCE_LOCATION].src_range;
 }
 
+/* Get the source_range of location LOC, either from the ad-hoc
+   lookaside table, or embedded inside LOC itself.  */
+
+source_range
+get_range_from_loc (struct line_maps *set,
+                   source_location loc)
+{
+  if (IS_ADHOC_LOC (loc))
+    return get_range_from_adhoc_loc (set, loc);
+
+  /* For ordinary maps, extract packed range.  */
+  if (loc >= RESERVED_LOCATION_COUNT
+      && loc < LINEMAPS_MACRO_LOWEST_LOCATION (set)
+      && loc <= LINE_MAP_MAX_LOCATION_WITH_COLS)
+    {
+      const line_map *map = linemap_lookup (set, loc);
+      const line_map_ordinary *ordmap = linemap_check_ordinary (map);
+      source_range result;
+      int offset = loc & ((1 << ordmap->m_range_bits) - 1);
+      result.m_start = loc - offset;
+      result.m_finish = result.m_start + (offset << ordmap->m_range_bits);
+      return result;
+    }
+
+  return source_range::from_location (loc);
+}
+
+/* Get whether location LOC is a "pure" location, or
+   whether it is an ad-hoc location, or embeds range information.  */
+
+bool
+pure_location_p (line_maps *set, source_location loc)
+{
+  if (IS_ADHOC_LOC (loc))
+    return false;
+
+  const line_map *map = linemap_lookup (set, loc);
+  const line_map_ordinary *ordmap = linemap_check_ordinary (map);
+
+  if (loc & ((1U << ordmap->m_range_bits) - 1))
+    return false;
+
+  return true;
+}
+
 /* Finalize the location_adhoc_data structure.  */
 void
 location_adhoc_data_fini (struct line_maps *set)
@@ -333,7 +475,19 @@ const struct line_map *
 linemap_add (struct line_maps *set, enum lc_reason reason,
             unsigned int sysp, const char *to_file, linenum_type to_line)
 {
-  source_location start_location = set->highest_location + 1;
+  /* Generate a start_location above the current highest_location.
+     If possible, make the low range bits be zero.  */
+  source_location start_location;
+  if (set->highest_location < LINE_MAP_MAX_LOCATION_WITH_COLS)
+    {
+      start_location = set->highest_location + (1 << set->default_range_bits);
+      if (set->default_range_bits)
+       start_location &= ~((1 << set->default_range_bits) - 1);
+      linemap_assert (0 == (start_location
+                           & ((1 << set->default_range_bits) - 1)));
+    }
+  else
+    start_location = set->highest_location + 1;
 
   linemap_assert (!(LINEMAPS_ORDINARY_USED (set)
                    && (start_location
@@ -412,11 +566,18 @@ linemap_add (struct line_maps *set, enum lc_reason reason,
   map->to_file = to_file;
   map->to_line = to_line;
   LINEMAPS_ORDINARY_CACHE (set) = LINEMAPS_ORDINARY_USED (set) - 1;
-  map->column_bits = 0;
+  map->m_column_and_range_bits = 0;
+  map->m_range_bits = 0;
   set->highest_location = start_location;
   set->highest_line = start_location;
   set->max_column_hint = 0;
 
+  /* This assertion is placed after set->highest_location has
+     been updated, since the latter affects
+     linemap_location_from_macro_expansion_p, which ultimately affects
+     pure_location_p.  */
+  linemap_assert (pure_location_p (set, start_location));
+
   if (reason == LC_ENTER)
     {
       map->included_from =
@@ -563,13 +724,14 @@ linemap_line_start (struct line_maps *set, linenum_type to_line,
     SOURCE_LINE (map, set->highest_line);
   int line_delta = to_line - last_line;
   bool add_map = false;
+  linemap_assert (map->m_column_and_range_bits >= map->m_range_bits);
+  int effective_column_bits = map->m_column_and_range_bits - map->m_range_bits;
 
   if (line_delta < 0
       || (line_delta > 10
-         && line_delta * ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) > 1000)
-      || (max_column_hint >= (1U << ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map)))
-      || (max_column_hint <= 80
-         && ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) >= 10)
+         && line_delta * map->m_column_and_range_bits > 1000)
+      || (max_column_hint >= (1U << effective_column_bits))
+      || (max_column_hint <= 80 && effective_column_bits >= 10)
       || (highest > LINE_MAP_MAX_LOCATION_WITH_COLS
          && (set->max_column_hint || highest >= LINE_MAP_MAX_SOURCE_LOCATION)))
     add_map = true;
@@ -578,22 +740,27 @@ linemap_line_start (struct line_maps *set, linenum_type to_line,
   if (add_map)
     {
       int column_bits;
+      int range_bits;
       if (max_column_hint > LINE_MAP_MAX_COLUMN_NUMBER
          || highest > LINE_MAP_MAX_LOCATION_WITH_COLS)
        {
          /* If the column number is ridiculous or we've allocated a huge
-            number of source_locations, give up on column numbers. */
+            number of source_locations, give up on column numbers
+            (and on packed ranges).  */
          max_column_hint = 0;
          column_bits = 0;
+         range_bits = 0;
          if (highest > LINE_MAP_MAX_SOURCE_LOCATION)
            return 0;
        }
       else
        {
          column_bits = 7;
+         range_bits = set->default_range_bits;
          while (max_column_hint >= (1U << column_bits))
            column_bits++;
          max_column_hint = 1U << column_bits;
+         column_bits += range_bits;
        }
       /* Allocate the new line_map.  However, if the current map only has a
         single line we can sometimes just increase its column_bits instead. */
@@ -606,14 +773,14 @@ linemap_line_start (struct line_maps *set, linenum_type to_line,
                                ORDINARY_MAP_IN_SYSTEM_HEADER_P (map),
                                ORDINARY_MAP_FILE_NAME (map),
                                to_line)));
-      map->column_bits = column_bits;
+      map->m_column_and_range_bits = column_bits;
+      map->m_range_bits = range_bits;
       r = (MAP_START_LOCATION (map)
           + ((to_line - ORDINARY_MAP_STARTING_LINE_NUMBER (map))
              << column_bits));
     }
   else
-    r = highest - SOURCE_COLUMN (map, highest)
-      + (line_delta << ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map));
+    r = set->highest_line + (line_delta << map->m_column_and_range_bits);
 
   /* Locations of ordinary tokens are always lower than locations of
      macro tokens.  */
@@ -624,6 +791,18 @@ linemap_line_start (struct line_maps *set, linenum_type to_line,
   if (r > set->highest_location)
     set->highest_location = r;
   set->max_column_hint = max_column_hint;
+
+  /* At this point, we expect one of:
+     (a) the normal case: a "pure" location with 0 range bits, or
+     (b) we've gone past LINE_MAP_MAX_LOCATION_WITH_COLS so can't track
+        columns anymore (or ranges), or
+     (c) we're in a region with a column hint exceeding
+        LINE_MAP_MAX_COLUMN_NUMBER, so column-tracking is off,
+       with column_bits == 0.  */
+  linemap_assert (pure_location_p (set, r)
+                 || r >= LINE_MAP_MAX_LOCATION_WITH_COLS
+                 || map->m_column_and_range_bits == 0);
+  linemap_assert (SOURCE_LINE (map, r) == to_line);
   return r;
 }
 
@@ -654,7 +833,8 @@ linemap_position_for_column (struct line_maps *set, unsigned int to_column)
          r = linemap_line_start (set, SOURCE_LINE (map, r), to_column + 50);
        }
     }
-  r = r + to_column;
+  line_map_ordinary *map = LINEMAPS_LAST_ORDINARY_MAP (set);
+  r = r + (to_column << map->m_range_bits);
   if (r >= set->highest_location)
     set->highest_location = r;
   return r;
@@ -664,16 +844,25 @@ linemap_position_for_column (struct line_maps *set, unsigned int to_column)
    column.  */
 
 source_location
-linemap_position_for_line_and_column (const line_map_ordinary *ord_map,
+linemap_position_for_line_and_column (line_maps *set,
+                                     const line_map_ordinary *ord_map,
                                      linenum_type line,
                                      unsigned column)
 {
   linemap_assert (ORDINARY_MAP_STARTING_LINE_NUMBER (ord_map) <= line);
 
-  return (MAP_START_LOCATION (ord_map)
-         + ((line - ORDINARY_MAP_STARTING_LINE_NUMBER (ord_map))
-            << ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (ord_map))
-         + (column & ((1 << ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (ord_map)) - 1)));
+  source_location r = MAP_START_LOCATION (ord_map);
+  r += ((line - ORDINARY_MAP_STARTING_LINE_NUMBER (ord_map))
+       << ord_map->m_column_and_range_bits);
+  if (r <= LINE_MAP_MAX_LOCATION_WITH_COLS)
+    r += ((column & ((1 << ord_map->m_column_and_range_bits) - 1))
+         << ord_map->m_range_bits);
+  source_location upper_limit = LINEMAPS_MACRO_LOWEST_LOCATION (set);
+  if (r >= upper_limit)
+    r = upper_limit - 1;
+  if (r > set->highest_location)
+    set->highest_location = r;
+  return r;
 }
 
 /* Encode and return a source_location starting from location LOC and
@@ -728,11 +917,11 @@ linemap_position_for_loc_and_offset (struct line_maps *set,
     }
 
   offset += column;
-  if (linemap_assert_fails (offset < (1u << map->column_bits)))
+  if (linemap_assert_fails (offset < (1u << map->m_column_and_range_bits)))
     return loc;
 
   source_location r = 
-    linemap_position_for_line_and_column (map, line, offset);
+    linemap_position_for_line_and_column (set, map, line, offset);
   if (linemap_assert_fails (r <= set->highest_location)
       || linemap_assert_fails (map == linemap_lookup (set, r)))
     return loc;
diff --git a/libcpp/location-example.txt b/libcpp/location-example.txt
index a5f95b2..14b5c2e 100644
--- a/libcpp/location-example.txt
+++ b/libcpp/location-example.txt
@@ -30,142 +30,154 @@ RESERVED LOCATIONS
   source_location interval: 0 <= loc < 2
 
 ORDINARY MAP: 0
-  source_location interval: 2 <= loc < 3
+  source_location interval: 32 <= loc < 64
   file: test.c
   starting at line: 1
-  column bits: 7
-test.c:  1|loc:    2|#include "test.h"
-                    |00000001111111111
-                    |34567890123456789
+  column bits: 12
+  range bits: 5
+test.c:  1|loc:   32|#include "test.h"
+                    |69269258258148147
+                    |46802468024680246
 
 ORDINARY MAP: 1
-  source_location interval: 3 <= loc < 4
+  source_location interval: 64 <= loc < 96
   file: <built-in>
   starting at line: 0
   column bits: 0
+  range bits: 0
 
 ORDINARY MAP: 2
-  source_location interval: 4 <= loc < 5
+  source_location interval: 96 <= loc < 128
   file: <command-line>
   starting at line: 0
   column bits: 0
+  range bits: 0
 
 ORDINARY MAP: 3
-  source_location interval: 5 <= loc < 5005
+  source_location interval: 128 <= loc < 160128
   file: /usr/include/stdc-predef.h
   starting at line: 1
-  column bits: 7
+  column bits: 12
+  range bits: 5
 (contents of /usr/include/stdc-predef.h snipped for brevity)
 
 ORDINARY MAP: 4
-  source_location interval: 5005 <= loc < 5006
+  source_location interval: 160128 <= loc < 160160
   file: <command-line>
-  starting at line: 1
-  column bits: 7
+  starting at line: 32
+  column bits: 12
+  range bits: 5
 
 ORDINARY MAP: 5
-  source_location interval: 5006 <= loc < 5134
+  source_location interval: 160160 <= loc < 164256
   file: test.c
   starting at line: 1
-  column bits: 7
-test.c:  1|loc: 5006|#include "test.h"
-                    |55555555555555555
+  column bits: 12
+  range bits: 5
+test.c:  1|loc:160160|#include "test.h"
                     |00000000000000000
-                    |00011111111112222
-                    |78901234567890123
+                    |12223334445556667
+                    |92582581481470470
+                    |24680246802468024
 
 ORDINARY MAP: 6
-  source_location interval: 5134 <= loc < 5416
+  source_location interval: 164256 <= loc < 173280
   file: test.h
   starting at line: 1
-  column bits: 7
-test.h:  1|loc: 5134|extern int foo ();
-                    |555555555555555555
-                    |111111111111111111
-                    |333334444444444555
-                    |567890123456789012
-test.h:  2|loc: 5262|
+  column bits: 12
+  range bits: 5
+test.h:  1|loc:164256|extern int foo ();
+                    |444444444444444444
+                    |233344455566677788
+                    |825814814704703603
+                    |802468024680246802
+test.h:  2|loc:168352|
                     |
                     |
                     |
                     |
-test.h:  3|loc: 5390|#define PLUS(A, B) A + B
-                    |555555555555555555555555
-                    |333333333444444444444444
-                    |999999999000000000011111
-                    |123456789012345678901234
+test.h:  3|loc:172448|#define PLUS(A, B) A + B
+                    |222222222222222223333333
+                    |455566677788889990001112
+                    |814704703603692692582581
+                    |024680246802468024680246
 
 ORDINARY MAP: 7
-  source_location interval: 5416 <= loc < 6314
+  source_location interval: 173280 <= loc < 202016
   file: test.c
   starting at line: 2
-  column bits: 7
-test.c:  2|loc: 5416|
+  column bits: 12
+  range bits: 5
+test.c:  2|loc:173280|
                     |
                     |
                     |
                     |
-test.c:  3|loc: 5544|int
-                    |555
-                    |555
+test.c:  3|loc:177376|int
+                    |777
                     |444
-                    |567
-test.c:  4|loc: 5672|main (int argc, char **argv)
-                    |5555555555555555555555555555
-                    |6666666666666666666666666667
-                    |7777777888888888899999999990
-                    |3456789012345678901234567890
-test.c:  5|loc: 5800|{
+                    |047
+                    |802
+test.c:  4|loc:181472|main (int argc, char **argv)
+                    |1111111111111111222222222222
+                    |5556666777888999000111222333
+                    |0360369269258258148147047036
+                    |4680246802468024680246802468
+test.c:  5|loc:185568|{
                     |5
-                    |8
-                    |0
-                    |1
-test.c:  6|loc: 5928|  int a = PLUS (1,2);
-                    |555555555555555555555
-                    |999999999999999999999
-                    |233333333334444444444
-                    |901234567890123456789
-test.c:  7|loc: 6056|  int b = PLUS (3,4);
-                    |666666666666666666666
-                    |000000000000000000000
-                    |555666666666677777777
-                    |789012345678901234567
-test.c:  8|loc: 6184|  return 0;
-                    |66666666666
-                    |11111111111
-                    |88888999999
-                    |56789012345
-test.c:  9|loc: 6312|}
                     |6
-                    |3
+                    |0
+                    |0
+test.c:  6|loc:189664|  int a = PLUS (1,2);
+                    |999999999900000000000
+                    |677788899900011122233
+                    |926925825814814704703
+                    |680246802468024680246
+test.c:  7|loc:193760|  int b = PLUS (3,4);
+                    |333333344444444444444
+                    |788899900011122233344
+                    |925825814814704703603
+                    |246802468024680246802
+test.c:  8|loc:197856|  return 0;
+                    |77778888888
+                    |89990001112
+                    |82581481470
+                    |80246802468
+test.c:  9|loc:201952|}
                     |1
-                    |3
+                    |9
+                    |8
+                    |4
 
 UNALLOCATED LOCATIONS
-  source_location interval: 6314 <= loc < 2147483633
+  source_location interval: 202016 <= loc < 2147483633
 
 MACRO 1: PLUS (7 tokens)
   source_location interval: 2147483633 <= loc < 2147483640
-test.c:7:11: note: expansion point is location 6067
+test.c:7:11: note: expansion point is location 194115
    int b = PLUS (3,4);
-           ^
+           ^~~~
+
   map->start_location: 2147483633
   macro_locations:
-    0: 6073, 5410
-test.c:7:17: note: token 0 has x-location == 6073
+    0: 194304, 173088
+test.c:7:17: note: token 0 has x-location == 194304
    int b = PLUS (3,4);
                  ^
-test.c:7:17: note: token 0 has y-location == 5410
-    1: 5412, 5412
+
+test.c:7:17: note: token 0 has y-location == 173088
+    1: 173152, 173152
 In file included from test.c:1:0:
-test.h:3:22: note: token 1 has x-location == y-location == 5412
+test.h:3:22: note: token 1 has x-location == y-location == 173152
  #define PLUS(A, B) A + B
                       ^
-    2: 6075, 5414
-test.c:7:19: note: token 2 has x-location == 6075
+
+    2: 194368, 173216
+test.c:7:19: note: token 2 has x-location == 194368
    int b = PLUS (3,4);
                    ^
-test.c:7:19: note: token 2 has y-location == 5414
+
+test.c:7:19: note: token 2 has y-location == 173216
     3: 0, 2947526575
 cc1: note: token 3 has x-location == 0
 cc1: note: token 3 has y-location == 2947526575
@@ -178,26 +190,30 @@ x-location == y-location == 2947526575 encodes token # 800042942
 
 MACRO 0: PLUS (7 tokens)
   source_location interval: 2147483640 <= loc < 2147483647
-test.c:6:11: note: expansion point is location 5939
+test.c:6:11: note: expansion point is location 190019
    int a = PLUS (1,2);
-           ^
+           ^~~~
+
   map->start_location: 2147483640
   macro_locations:
-    0: 5945, 5410
-test.c:6:17: note: token 0 has x-location == 5945
+    0: 190208, 173088
+test.c:6:17: note: token 0 has x-location == 190208
    int a = PLUS (1,2);
                  ^
-test.c:6:17: note: token 0 has y-location == 5410
-    1: 5412, 5412
+
+test.c:6:17: note: token 0 has y-location == 173088
+    1: 173152, 173152
 In file included from test.c:1:0:
-test.h:3:22: note: token 1 has x-location == y-location == 5412
+test.h:3:22: note: token 1 has x-location == y-location == 173152
  #define PLUS(A, B) A + B
                       ^
-    2: 5947, 5414
-test.c:6:19: note: token 2 has x-location == 5947
+
+    2: 190272, 173216
+test.c:6:19: note: token 2 has x-location == 190272
    int a = PLUS (1,2);
                    ^
-test.c:6:19: note: token 2 has y-location == 5414
+
+test.c:6:19: note: token 2 has y-location == 173216
     3: 0, 2947526575
 cc1: note: token 3 has x-location == 0
 cc1: note: token 3 has y-location == 2947526575
-- 
1.8.5.3
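
For readers who want to check the numbers in the updated
location-example.txt by hand, here is a minimal standalone sketch of the
arithmetic the patch performs.  It is not the libcpp code: loc_t and all
of the helper names below are hypothetical, and the maps are reduced to
plain integers.  pack_short_range additionally assumes that the low range
bits hold the column distance from the start of a range to its finish;
that matches "expansion point is location 194115" for the 4-character
token PLUS at test.c:7:11, but the authoritative description is the
comment in line-map.h.

```c
#include <stdint.h>

/* Hypothetical stand-in for source_location.  */
typedef uint32_t loc_t;

/* Mirrors the rounding in linemap_add: step past HIGHEST and clear the
   low RANGE_BITS so that the new map starts on a "pure" location.  */
loc_t next_aligned_start (loc_t highest, unsigned range_bits)
{
  loc_t r = highest + (1u << range_bits);
  return r & ~((1u << range_bits) - 1);
}

/* Mirrors the pure_location_p test for a non-ad-hoc location: the low
   RANGE_BITS must all be zero.  */
int is_pure (loc_t loc, unsigned range_bits)
{
  return (loc & ((1u << range_bits) - 1)) == 0;
}

/* Lines advance by 1 << COL_AND_RANGE_BITS; columns are shifted left by
   RANGE_BITS so that the low bits stay free for a packed range.  */
loc_t loc_for_line_and_column (loc_t map_start, unsigned start_line,
                               unsigned col_and_range_bits,
                               unsigned range_bits,
                               unsigned line, unsigned column)
{
  return map_start
         + ((line - start_line) << col_and_range_bits)
         + (column << range_bits);
}

/* ASSUMPTION: a short range is stored as the column distance from start
   to finish in the low RANGE_BITS of a pure start location.  Returns 0
   when the distance does not fit, i.e. the ad-hoc table is needed.  */
loc_t pack_short_range (loc_t pure_start, unsigned col_diff,
                        unsigned range_bits)
{
  if (!is_pure (pure_start, range_bits) || col_diff >= (1u << range_bits))
    return 0;
  return pure_start | col_diff;
}

/* Recover the line and column from a (possibly packed) location.  */
unsigned line_of (loc_t loc, loc_t map_start, unsigned start_line,
                  unsigned col_and_range_bits)
{
  return ((loc - map_start) >> col_and_range_bits) + start_line;
}

unsigned column_of (loc_t loc, loc_t map_start,
                    unsigned col_and_range_bits, unsigned range_bits)
{
  return ((loc - map_start) & ((1u << col_and_range_bits) - 1))
         >> range_bits;
}
```

With column bits 12 and range bits 5, ordinary map 6 (test.h, starting at
164256, line 1) puts test.h:3:22 at 164256 + (2 << 12) + (22 << 5), which
is the 173152 shown for token 1.  Ordinary map 7 (test.c, starting at
173280, line 2) puts test.c:7:11 at 194112, and or-ing in a column
distance of 3 gives the 194115 reported as the expansion point.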
