davidzollo commented on PR #10447:
URL: https://github.com/apache/seatunnel/pull/10447#issuecomment-3941037673
There are two critical functional bugs that must be fixed before this can be
merged,
plus several smaller issues.
---
## [BLOCKER] access_token is declared but never injected into HTTP headers
The `access_token` option is marked as required in
`HubSpotSourceFactory.optionRule()`,
but it is **never used anywhere in the actual HTTP request**. Every request
sent to
the real HubSpot API will return `401 Unauthorized`.
The established pattern in this project is to override `buildWithConfig()`
in a
custom `HttpParameter` subclass and inject the token as an `Authorization:
Bearer`
header. See how the existing connectors do it:
- `GithubSourceParameter.buildWithConfig()` → injects `Authorization: Bearer
<token>`
- `GitlabSourceParameter.buildWithConfig()` → injects `PRIVATE-TOKEN:
<token>`
- `KlaviyoSourceParameter.buildWithConfig()` → injects `Authorization:
Klaviyo-API-Key <key>`
**Current `HubSpotSourceParameter` (effectively empty):**
```java
public class HubSpotSourceParameter extends HttpParameter {
// Only holds a constant — buildWithConfig() is never overridden
public static final String DEFAULT_CONTENT_FIELD = "$.results";
}
```
**Current `HubSpotSource` (createReader() not overridden):**
```java
public class HubSpotSource extends HttpSource {
public HubSpotSource(ReadonlyConfig config) {
super(config); // uses base HttpParameter — no auth header
}
// createReader() falls through to HttpSource, passing the plain
httpParameter
}
```
**Required fix:**
1. Override `buildWithConfig()` in `HubSpotSourceParameter` to inject the
Bearer token:
```java
@Override
public void buildWithConfig(ReadonlyConfig pluginConfig) {
super.buildWithConfig(pluginConfig);
this.headers = this.getHeaders() == null ? new HashMap<>() :
this.getHeaders();
this.headers.put(
"Authorization",
"Bearer " + pluginConfig.get(HubSpotSourceOptions.ACCESS_TOKEN));
this.setHeaders(this.headers);
}
```
2. Override `createReader()` in `HubSpotSource` to pass the custom parameter
(see `GitlabSource` as reference):
```java
private final HubSpotSourceParameter hubSpotSourceParameter = new
HubSpotSourceParameter();
public HubSpotSource(ReadonlyConfig config) {
super(config);
hubSpotSourceParameter.buildWithConfig(config);
}
@Override
public AbstractSingleSplitReader<SeaTunnelRow> createReader(
SingleSplitReaderContext readerContext) throws Exception {
return new HttpSourceReader(
this.hubSpotSourceParameter,
readerContext,
this.deserializationSchema,
jsonField,
contentField);
}
```
The E2E test currently passes only because MockServer does not validate the
`Authorization` header. This bug would be invisible in CI but break
immediately
against the real HubSpot API.
---
## [BLOCKER] object_type is never mapped to a URL
The PR description states: *"Added dynamic configuration logic to map
object_type
to API URLs."* However, **no such mapping exists in the code**.
`object_type` is
only referenced in `HubSpotSourceOptions` and the unit test — it never
influences
the actual request URL.
The E2E test works around this by hardcoding the URL directly:
```hocon
# hubspot_source_case.conf
url = "http://mock-server:1080/crm/v3/objects/contacts" # bypasses
object_type entirely
```
Meanwhile the documentation example shows only `access_token` and
`object_type`,
with no `url` field — implying the URL is auto-generated, which it is not:
```hocon
# docs/en/connectors/source/Hubspot.md
source {
HubSpot {
access_token = "pat-na1-..."
object_type = "contacts" # user expects this to be enough, but url
is also required
}
}
```
**Required fix:** Add URL construction logic in
`HubSpotSourceParameter.buildWithConfig()`:
```java
private static final String HUBSPOT_BASE_URL =
"https://api.hubapi.com/crm/v3/objects/";
// Build URL from object_type when url is not explicitly provided
if (this.getUrl() == null || this.getUrl().isEmpty()) {
String objectType = pluginConfig.get(HubSpotSourceOptions.OBJECT_TYPE);
this.setUrl(HUBSPOT_BASE_URL + objectType);
}
```
---
## [REQUIRED] HubSpotSourceFactory.createSource() uses a non-standard
pattern
```java
// HubSpotSourceFactory.java L61-70
Map<String, Object> configMap = new HashMap<>(options.toMap());
if (!configMap.containsKey("content_field")) {
configMap.put("content_field",
HubSpotSourceParameter.DEFAULT_CONTENT_FIELD); // bare String key
}
ReadonlyConfig modifiedConfig = ReadonlyConfig.fromMap(configMap);
return () -> (SeaTunnelSource<T, SplitT, StateT>) new
HubSpotSource(modifiedConfig); // unchecked cast
```
Issues:
- Using a raw String `"content_field"` instead of a typed `Option` constant
is fragile and bypasses compile-time safety.
- Setting `content_field` is a responsibility of `HubSpotSourceParameter`,
not the factory.
- Every other connector's `createSource()` is simply `() -> new
XxxSource(config)`.
- The unchecked cast `(SeaTunnelSource<T, SplitT, StateT>)` generates a
compiler warning.
Once `HubSpotSourceParameter.buildWithConfig()` is implemented correctly
(handling
token injection, URL construction, and default `content_field`), the factory
should
be simplified to:
```java
@Override
public <T, SplitT extends SourceSplit, StateT extends Serializable>
TableSource<T, SplitT, StateT>
createSource(TableSourceFactoryContext context) {
return () -> (SeaTunnelSource<T, SplitT, StateT>)
new HubSpotSource(context.getOptions());
}
```
---
## [REQUIRED] Documentation issues
**1. Wrong filename casing:** The file is named `Hubspot.md` (lowercase "s")
but
should be `HubSpot.md` to match the connector identifier and the naming
convention
of other connector docs.
**2. Broken Markdown table in Key Features section (`Hubspot.md` L18):**
```
| Exactly-Once | [ ] | | Parallelism | [ ] | ```
```
The table collapses onto one line and a stray code fence appears. Please
split into
proper table rows.
**3. Documentation example is misleading:** The example omits `url`, which is
currently required. Either fix the code so `url` is truly optional
(auto-derived
from `object_type`), or add `url` to the example with a note.
---
## [MINOR] Unit tests do not cover core behavior
`HubSpotSourceFactoryTest` only verifies the `factoryIdentifier` string and
that
`ACCESS_TOKEN` appears in `requiredOptions`. The most important behaviors
are untested:
- That `access_token` is injected as `Authorization: Bearer <token>` in HTTP
headers.
- That `object_type` produces the correct API URL.
- That the default `content_field` (`$.results`) is set when not explicitly
configured.
The E2E test only asserts `exitCode == 0` without verifying actual data
content.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]