davidzollo commented on PR #10447:
URL: https://github.com/apache/seatunnel/pull/10447#issuecomment-3941037673

   There are two critical functional bugs that must be fixed before this can be 
merged,
   plus several smaller issues.
   
   ---
   
   ##  [BLOCKER] access_token is declared but never injected into HTTP headers
   
   The `access_token` option is marked as required in 
`HubSpotSourceFactory.optionRule()`,
   but it is **never used anywhere in the actual HTTP request**. Every request 
sent to
   the real HubSpot API will return `401 Unauthorized`.
   
   The established pattern in this project is to override `buildWithConfig()` 
in a
   custom `HttpParameter` subclass and inject the token as an `Authorization: 
Bearer`
   header. See how the existing connectors do it:
   
   - `GithubSourceParameter.buildWithConfig()` → injects `Authorization: Bearer 
<token>`
   - `GitlabSourceParameter.buildWithConfig()` → injects `PRIVATE-TOKEN: 
<token>`
   - `KlaviyoSourceParameter.buildWithConfig()` → injects `Authorization: 
Klaviyo-API-Key <key>`
   
   **Current `HubSpotSourceParameter` (effectively empty):**
   ```java
   public class HubSpotSourceParameter extends HttpParameter {
       // Only holds a constant — buildWithConfig() is never overridden
       public static final String DEFAULT_CONTENT_FIELD = "$.results";
   }
   ```
   
   **Current `HubSpotSource` (createReader() not overridden):**
   ```java
   public class HubSpotSource extends HttpSource {
       public HubSpotSource(ReadonlyConfig config) {
           super(config);  // uses base HttpParameter — no auth header
       }
       // createReader() falls through to HttpSource, passing the plain 
httpParameter
   }
   ```
   
   **Required fix:**
   
   1. Override `buildWithConfig()` in `HubSpotSourceParameter` to inject the 
Bearer token:
   ```java
   @Override
   public void buildWithConfig(ReadonlyConfig pluginConfig) {
       super.buildWithConfig(pluginConfig);
       this.headers = this.getHeaders() == null ? new HashMap<>() : 
this.getHeaders();
       this.headers.put(
           "Authorization",
           "Bearer " + pluginConfig.get(HubSpotSourceOptions.ACCESS_TOKEN));
       this.setHeaders(this.headers);
   }
   ```
   
   2. Override `createReader()` in `HubSpotSource` to pass the custom parameter 
(see `GitlabSource` as reference):
   ```java
   private final HubSpotSourceParameter hubSpotSourceParameter = new 
HubSpotSourceParameter();
   
   public HubSpotSource(ReadonlyConfig config) {
       super(config);
       hubSpotSourceParameter.buildWithConfig(config);
   }
   
   @Override
   public AbstractSingleSplitReader<SeaTunnelRow> createReader(
           SingleSplitReaderContext readerContext) throws Exception {
       return new HttpSourceReader(
               this.hubSpotSourceParameter,
               readerContext,
               this.deserializationSchema,
               jsonField,
               contentField);
   }
   ```
   
   The E2E test currently passes only because MockServer does not validate the
   `Authorization` header. This bug would be invisible in CI but break 
immediately
   against the real HubSpot API.
   
   ---
   
   ##  [BLOCKER] object_type is never mapped to a URL
   
   The PR description states: *"Added dynamic configuration logic to map 
object_type
   to API URLs."* However, **no such mapping exists in the code**. 
`object_type` is
   only referenced in `HubSpotSourceOptions` and the unit test — it never 
influences
   the actual request URL.
   
   The E2E test works around this by hardcoding the URL directly:
   ```hocon
   # hubspot_source_case.conf
   url = "http://mock-server:1080/crm/v3/objects/contacts";  # bypasses 
object_type entirely
   ```
   
   Meanwhile the documentation example shows only `access_token` and 
`object_type`,
   with no `url` field — implying the URL is auto-generated, which it is not:
   ```hocon
   # docs/en/connectors/source/Hubspot.md
   source {
     HubSpot {
       access_token = "pat-na1-..."
       object_type = "contacts"       # user expects this to be enough, but url 
is also required
     }
   }
   ```
   
   **Required fix:** Add URL construction logic in 
`HubSpotSourceParameter.buildWithConfig()`:
   ```java
   private static final String HUBSPOT_BASE_URL = 
"https://api.hubapi.com/crm/v3/objects/";;
   
   // Build URL from object_type when url is not explicitly provided
   if (this.getUrl() == null || this.getUrl().isEmpty()) {
       String objectType = pluginConfig.get(HubSpotSourceOptions.OBJECT_TYPE);
       this.setUrl(HUBSPOT_BASE_URL + objectType);
   }
   ```
   
   ---
   
   ##  [REQUIRED] HubSpotSourceFactory.createSource() uses a non-standard 
pattern
   
   ```java
   // HubSpotSourceFactory.java L61-70
   Map<String, Object> configMap = new HashMap<>(options.toMap());
   if (!configMap.containsKey("content_field")) {
       configMap.put("content_field", 
HubSpotSourceParameter.DEFAULT_CONTENT_FIELD);  // bare String key
   }
   ReadonlyConfig modifiedConfig = ReadonlyConfig.fromMap(configMap);
   return () -> (SeaTunnelSource<T, SplitT, StateT>) new 
HubSpotSource(modifiedConfig);  // unchecked cast
   ```
   
   Issues:
   - Using a raw String `"content_field"` instead of a typed `Option` constant 
is fragile and bypasses compile-time safety.
   - Setting `content_field` is a responsibility of `HubSpotSourceParameter`, 
not the factory.
   - Every other connector's `createSource()` is simply `() -> new 
XxxSource(config)`.
   - The unchecked cast `(SeaTunnelSource<T, SplitT, StateT>)` generates a 
compiler warning.
   
   Once `HubSpotSourceParameter.buildWithConfig()` is implemented correctly 
(handling
   token injection, URL construction, and default `content_field`), the factory 
should
   be simplified to:
   ```java
   @Override
   public <T, SplitT extends SourceSplit, StateT extends Serializable>
           TableSource<T, SplitT, StateT> 
createSource(TableSourceFactoryContext context) {
       return () -> (SeaTunnelSource<T, SplitT, StateT>)
               new HubSpotSource(context.getOptions());
   }
   ```
   
   ---
   
   ##  [REQUIRED] Documentation issues
   
   **1. Wrong filename casing:** The file is named `Hubspot.md` (lowercase "s") 
but
   should be `HubSpot.md` to match the connector identifier and the naming 
convention
   of other connector docs.
   
   **2. Broken Markdown table in Key Features section (`Hubspot.md` L18):**
   ```
   | Exactly-Once | [ ] | | Parallelism | [ ] | ```
   ```
   The table collapses onto one line and a stray code fence appears. Please 
split into
   proper table rows.
   
   **3. Documentation example is misleading:** The example omits `url`, which is
   currently required. Either fix the code so `url` is truly optional 
(auto-derived
   from `object_type`), or add `url` to the example with a note.
   
   
   ---
   
   ##  [MINOR] Unit tests do not cover core behavior
   
   `HubSpotSourceFactoryTest` only verifies the `factoryIdentifier` string and 
that
   `ACCESS_TOKEN` appears in `requiredOptions`. The most important behaviors 
are untested:
   - That `access_token` is injected as `Authorization: Bearer <token>` in HTTP 
headers.
   - That `object_type` produces the correct API URL.
   - That the default `content_field` (`$.results`) is set when not explicitly 
configured.
   
   The E2E test only asserts `exitCode == 0` without verifying actual data 
content.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to