On Mon, Nov 12, 2018 at 10:30 PM Michael Powell <[email protected]> wrote:
> On Mon, Nov 12, 2018 at 12:46 PM Michael Powell <[email protected]> wrote:
> >
> > On Mon, Nov 12, 2018 at 10:06 AM Michael Powell <[email protected]> wrote:
> > >
> > > Hello,
> > >
> > > Another question following up, how about the sign character for hex
> > > and oct integers? Is it necessary, should it be discarded?
> > >
> > > intLit = decimalLit | octalLit | hexLit
> > > decimalLit = ( "1" … "9" ) { decimalDigit }
> > > octalLit = "0" { octalDigit }
> > > hexLit = "0" ( "x" | "X" ) hexDigit { hexDigit }
> > >
> > > constant = fullIdent | ( [ "-" | "+" ] intLit ) | ( [ "-" | "+" ]
> > > floatLit ) | strLit | boolLit
> > >
> > > https://developers.google.com/protocol-buffers/docs/reference/proto2-spec#integer_literals
> > > https://developers.google.com/protocol-buffers/docs/reference/proto2-spec#constant
> > >
> > > For instance, I am fairly certain the sign character is encoded in a
> > > hex encoded integer. Not sure about octal, but I imagine that it is
> > > fairly consistent.
>
> Got it sorted out I believe. Actually, it's quite nice the parser
> support Spirit provides, aligns pretty much perfectly with the grammar
> specification. There's a bit of gymnastics involved juggling whether
> the AST has a sign or not and so forth, but other than that, it flows
> well enough.

If you haven't already, take a look at descriptor.proto
<https://github.com/protocolbuffers/protobuf/blob/master/src/google/protobuf/descriptor.proto>
-- FileDescriptorProto
<https://github.com/protocolbuffers/protobuf/blob/master/src/google/protobuf/descriptor.proto#L61>
therein is basically like an AST for the proto language (and is what protoc
produces as it parses). And for parsing options and the literal values in
particular, take a look at UninterpretedOption
<https://github.com/protocolbuffers/protobuf/blob/master/src/google/protobuf/descriptor.proto#L701>.
Options are first parsed into this structure, and then "interpreted" into
the attributes of *Options messages in a second pass. You'll see that the
approach there includes the negation in the literal integer value but also
distinguishes between the two
<https://github.com/protocolbuffers/protobuf/blob/master/src/google/protobuf/descriptor.proto#L716>
in the AST.

> > > Case in point, the value 107026150751750362 gets encoded as
> > > 0X17C3BB7913C48DA (upper-case). Whereas its negative counterpart,
> > > -107026150751750362, really does get encoded as 0xFE83C4486EC3B726.
> > > Signage included, if memory serves.
> > >
> > > In these cases, I think the sign bit falls in the "optional" category?
> >
> > So... As far as I can determine, there are a couple of ways to
> > interpret this, semantically speaking. But this potentially informs
> > whatever parsing stack you are using as well.
> >
> > I'm using Boost Spirit Qi, for instance, which supports radix-based
> > integer parsing well enough, but has its own set of issues when
> > dealing with signage. That being said...
> >
> > 1. Treat the value itself as positive one way or another, with an
> >    optional sign attribute (i.e. '+' or '-'). This would potentially
> >    work, especially when there is base 16 (hex) or base 8 (octal)
> >    involved.
> >
> > 2. Otherwise, I am open to suggestions, within Qi's constraints: as
> >    far as I know, it fails to parse negative signed hexadecimal/octal
> >    encoded values.
> >
> > Again, kind of a symptom of an imprecise grammar specification. I can
> > get a sense for how to handle it, but does it truly capture "intent"?
> >
> > Thanks in advance for any light that can be shed.
> >
> > > Cheers, thanks,
> > >
> > > Michael
> > >
> > > On Sun, Nov 11, 2018 at 10:56 AM Josh Humphries <[email protected]> wrote:
> > > >
> > > > For the case of zero by itself, per the spec, it will be parsed as
> > > > an octal literal with value zero -- so functionally equivalent to a
> > > > decimal literal with value zero.
> > > > And for values with multiple digits, a leading zero means it is an
> > > > octal literal. Decimal values will not have a leading zero.
> > > >
> > > > ----
> > > > Josh Humphries
> > > > [email protected]
> > > >
> > > >
> > > > On Sat, Nov 10, 2018 at 10:16 PM Michael Powell <[email protected]> wrote:
> > > >>
> > > >> Hello,
> > > >>
> > > >> I think 0 can be a decimal-lit, don't you think? However, the spec
> > > >> reads as follows:
> > > >>
> > > >> intLit = decimalLit | octalLit | hexLit
> > > >> decimalLit = ( "1" … "9" ) { decimalDigit }
> > > >> octalLit = "0" { octalDigit }
> > > >> hexLit = "0" ( "x" | "X" ) hexDigit { hexDigit }
> > > >>
> > > >> Is there a reason, semantically speaking, why decimal must be
> > > >> greater than 0? And that's not including a plus/minus sign when
> > > >> you factor in constants.
> > > >>
> > > >> Of course, parsing, order matters, similar as with the escape
> > > >> character phrases in the string-literal:
> > > >>
> > > >> hex-lit | oct-lit | dec-lit
> > > >>
> > > >> And so on, since you have to rule out 0x\d+ for hex, followed by
> > > >> 0\d* ...
> > > >>
> > > >> Actually, now that I look at it "0" (really, "decimal" 0) is lurking
> > > >> in the oct-lit phrase.
> > > >>
> > > >> Kind of a grammatical nit-pick, I know, but I just wanted to be
> > > >> clear here. Seems like a possible source of confusion if you aren't
> > > >> paying careful attention.
> > > >>
> > > >> Thoughts?
> > > >>
> > > >> Best regards,
> > > >>
> > > >> Michael Powell
> > > >>
> > > >> --
> > > >> You received this message because you are subscribed to the Google
> > > >> Groups "Protocol Buffers" group.
> > > >> To unsubscribe from this group and stop receiving emails from it,
> > > >> send an email to [email protected].
> > > >> To post to this group, send email to [email protected].
> > > >> Visit this group at https://groups.google.com/group/protobuf.
> > > >> For more options, visit https://groups.google.com/d/optout.
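
To make the point about zero concrete, here is a minimal Python sketch (not protoc's implementation, just an illustration of the grammar quoted above). It classifies an unsigned intLit by trying the alternatives in the order hex | octal | decimal, as suggested in the original question. The name parse_int_lit and the three regex constants are hypothetical, chosen only for this example.

```python
import re

# One pattern per alternative of the quoted intLit production.
HEX_LIT = re.compile(r"0[xX][0-9a-fA-F]+")
OCTAL_LIT = re.compile(r"0[0-7]*")
DECIMAL_LIT = re.compile(r"[1-9][0-9]*")


def parse_int_lit(text):
    """Return (kind, value) for an unsigned intLit, or raise ValueError."""
    if HEX_LIT.fullmatch(text):
        # Skip the "0x"/"0X" prefix and read the remaining digits as base 16.
        return "hex", int(text[2:], 16)
    if OCTAL_LIT.fullmatch(text):
        # A lone "0" lands here: octal by the grammar, value zero.
        return "octal", int(text, 8)
    if DECIMAL_LIT.fullmatch(text):
        return "decimal", int(text, 10)
    raise ValueError(f"not an intLit: {text!r}")
```

With this sketch, parse_int_lit("0") yields an octal literal with value zero, while an input such as "08" is rejected because none of the three alternatives matches it in full.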
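
For the sign question, option 1 from the thread (keep the magnitude unsigned and carry the sign as a separate attribute) can be sketched as follows. This is a Python illustration under assumptions, not Spirit Qi code and not how protoc does it: SignedIntLit, parse_signed_int_lit and as_uint64_hex are hypothetical names, and the 64-bit width is an assumption. Under that assumption, the two hex figures quoted in the thread are consistent with each other: 0xFE83C4486EC3B726 is the 64-bit two's-complement bit pattern of -0x17C3BB7913C48DA.

```python
import re
from dataclasses import dataclass

# [ "-" | "+" ] intLit, with one capture group per radix alternative.
SIGNED_INT_LIT = re.compile(
    r"([-+]?)(?:0[xX]([0-9a-fA-F]+)|(0[0-7]*)|([1-9][0-9]*))"
)


@dataclass
class SignedIntLit:
    negative: bool  # True only when an explicit "-" preceded the literal
    magnitude: int  # always non-negative, whatever the radix

    @property
    def value(self):
        return -self.magnitude if self.negative else self.magnitude


def parse_signed_int_lit(text):
    """Parse a signed intLit into a sign flag plus an unsigned magnitude."""
    m = SIGNED_INT_LIT.fullmatch(text)
    if m is None:
        raise ValueError(f"not a signed intLit: {text!r}")
    sign, hex_digits, octal_digits, decimal_digits = m.groups()
    if hex_digits is not None:
        magnitude = int(hex_digits, 16)
    elif octal_digits is not None:
        magnitude = int(octal_digits, 8)
    else:
        magnitude = int(decimal_digits, 10)
    return SignedIntLit(sign == "-", magnitude)


def as_uint64_hex(value):
    """Hex digits of value's 64-bit two's-complement bit pattern (assumed width)."""
    return format(value & 0xFFFFFFFFFFFFFFFF, "X")
```

Keeping the sign out of the magnitude means the radix parser never has to see a negative number, which sidesteps the signed hex/octal parsing problem described above. The bit pattern with the sign folded in only appears when the value is rendered at a fixed width.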
