BigQuery Storage gRPC-Web: Got 404 error while creating a read session

New updates at the end.

We are creating a BigQuery Storage library in JavaScript for browsers.

The problem we are facing is that we obtain a 404 error from https://bigquerystorage.googleapis.com/:

[Image: 404 camouflaged as CORS]

It looks like a CORS problem, but it really is a 404 one:

$> curl 'https://bigquerystorage.googleapis.com/google.cloud.bigquery.storage.v1beta1.BigQueryStorage/CreateReadSession' \
       -H 'x-goog-request-params: table_reference.project_id=project&table_reference.dataset_id=dataset' \
       -H 'X-User-Agent: grpc-web-javascript/0.1' -H 'DNT: 1' -H 'Content-Type: application/grpc-web-text' \
       -H 'Accept: application/grpc-web-text' -H 'X-Grpc-Web: 1' -H 'Sec-Fetch-Dest: empty' \
       --data-binary 'ydG9k_binary_data_oATgB' --compressed

<!DOCTYPE html>
<html lang=en>
  <meta charset=utf-8>
  <meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">
  <title>Error 404 (Not Found)!!1</title>
...

Any idea what is happening with the request?

Here are all the resources we are using. This is the base structure:

[Image: project structure]

The original .proto files come from:

To compile the .proto files into .js files we use protoc with the protoc-gen-grpc-web plugin:

cd src/proto/
protoc -I=. $(find . -iname "*.proto") --js_out=import_style=commonjs:. --grpc-web_out=import_style=commonjs,mode=grpcwebtext:.

And finally, this is the big_query_storage.js file that we bundle with rollup.js:

import BigQueryStorage from './proto/storage_grpc_web_pb.js';
import BigQueryStorageEnums from './proto/storage_pb.js';
import TableComponents from './proto/google/cloud/bigquery/storage/v1beta1/table_reference_pb.js';
import Timestamp from 'google-protobuf/google/protobuf/timestamp_pb.js';

const bigQueryStorageHostname = 'https://bigquerystorage.googleapis.com';

const projectId = 'project';
const datasetId = 'dataset';
const tableId = 'table';

const fields = ['column'];
const rowRestriction = 'where';

function BigQueryStorageTest() {
  let client = new BigQueryStorage.BigQueryStorageClient(bigQueryStorageHostname);

  let tableReference = new TableComponents.TableReference();
  tableReference.setProjectId(projectId);
  tableReference.setDatasetId(datasetId);
  tableReference.setTableId(tableId);

  let readOptions = new TableComponents.TableReadOptions();
  readOptions.setSelectedFieldsList(fields);
  readOptions.setRowRestriction(rowRestriction);

  const parent = `projects/${projectId}`;

  let readSessionRequest = new BigQueryStorage.CreateReadSessionRequest();
  readSessionRequest.setTableReference(tableReference);
  readSessionRequest.setParent(parent);
  readSessionRequest.setReadOptions(readOptions);
  readSessionRequest.setFormat(BigQueryStorageEnums.DataFormat.AVRO);
  readSessionRequest.setShardingStrategy(BigQueryStorageEnums.ShardingStrategy.LIQUID);

  let metadata = {};

  let routingHeader = new URLSearchParams({
    'table_reference.project_id': encodeURI(tableReference.getProjectId()),
    'table_reference.dataset_id': encodeURI(tableReference.getDatasetId())
  }).toString();

  let routingMetadata = {
    'x-goog-request-params': routingHeader
  };

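  metadata = {...metadata, ...routingMetadata};

  // Unary gRPC-Web calls deliver their result through a callback.
  client.createReadSession(readSessionRequest, metadata, (err, session) => {
    if (err) {
      console.error(err.code, err.message);
      return;
    }
    console.log(session.toObject());
  });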
}

export default BigQueryStorageTest;
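
A side note on the routing header built in the code above: URLSearchParams already percent-encodes its values, so wrapping them in encodeURI first encodes them twice whenever an ID contains a reserved character. For plain IDs such as 'project' both forms give the same string, so this is probably not the cause of the 404. A minimal sketch of the difference, where buildRoutingHeader is a hypothetical helper and the IDs are made-up placeholders:

```javascript
// Hypothetical helper: builds the value for the x-goog-request-params header.
// URLSearchParams percent-encodes keys and values by itself.
function buildRoutingHeader(projectId, datasetId) {
  return new URLSearchParams({
    'table_reference.project_id': projectId,
    'table_reference.dataset_id': datasetId
  }).toString();
}

// Placeholder IDs; the space makes the encoding visible.
const single = buildRoutingHeader('my project', 'dataset');
const double = buildRoutingHeader(encodeURI('my project'), 'dataset');

console.log(single); // space encoded once
console.log(double); // the '%' produced by encodeURI is encoded again as '%25'
```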

UPDATE 1:

It looks like the BigQuery Storage API doesn't support gRPC-Web calls in text mode. Text mode sets the Content-Type header to application/grpc-web-text and sends the binary body encoded in base64.

Because of that, the API apparently can't find the table in the encoded body and returns a 404 error.
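
To make the text-mode wire format concrete: gRPC-Web prefixes each serialized message with a 5-byte header (one flag byte plus a 4-byte big-endian length), and in text mode the framed bytes are then base64-encoded. A minimal sketch, assuming a runtime with a global btoa (browsers or a recent Node.js). frameMessage and toGrpcWebText are hypothetical helper names, and the payload bytes are placeholders, not a real CreateReadSessionRequest:

```javascript
// Frame a serialized protobuf message the way gRPC-Web does:
// 1 flag byte (0x00 = uncompressed data frame) + 4-byte big-endian length + payload.
function frameMessage(payload) {
  const framed = new Uint8Array(5 + payload.length);
  const view = new DataView(framed.buffer);
  view.setUint8(0, 0x00);
  view.setUint32(1, payload.length, false); // false = big-endian
  framed.set(payload, 5);
  return framed;
}

// Text mode (application/grpc-web-text): base64-encode the framed bytes.
function toGrpcWebText(framed) {
  let binary = '';
  for (const byte of framed) binary += String.fromCharCode(byte);
  return btoa(binary);
}

const payload = new Uint8Array([0x0a, 0x03, 0x66, 0x6f, 0x6f]); // placeholder bytes
const framed = frameMessage(payload);

console.log(framed);                // what a binary-mode body would carry
console.log(toGrpcWebText(framed)); // what a text-mode body would carry
```

A server that only understands the binary form would see the base64 text as an opaque body, which fits the behaviour described above.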

After modifying the compiled .js files (generated from the .proto files) to use the binary format, the body is sent as binary and the Content-Type header is set to application/x-protobuf. This gets past the 404, but the API now returns a 400: Invalid resource field value in the request.

UPDATE 2:

There is a new BigQuery Storage library for Node.js, https://github.com/googleapis/nodejs-bigquery-storage, which works like a charm in Node.js, but it has some problems when ported to browsers.

It looks like it can't perform streaming operations; it throws this error when accessing the generated stub for the readRows method:

TypeError: undefined is not a function
    at Service.newServiceStub.<computed> [as readRows] (fallback.js:190)
    at big_query_storage_client.js:147
    at streamingApiCaller.js:37
    at timeout.js:43
    at Object.request (streaming.js:102)
    at makeRequest (index.js:128)
    at retryRequest (index.js:96)
    at StreamProxy.setStream (streaming.js:93)
    at StreamingApiCaller.call (streamingApiCaller.js:53)
    at createApiCall.js:72
