HTTPConnector

We assume that you have already learned what is described in:

If you want to find the right Reader for your purposes, see Readers Comparison.

Short Summary

HTTPConnector sends HTTP requests to a web server and receives responses.

| Component | Same input metadata | Sorted inputs | Inputs | Outputs | Each to all outputs 1) | Java | CTL |
| --- | --- | --- | --- | --- | --- | --- | --- |
| HTTPConnector | - | no | 0-1 | 0-1 | - | no | no |

1) Component sends each data record to all connected output ports.

Abstract

HTTPConnector sends HTTP requests to a web server and receives responses. The request is either defined in a file or in the graph itself, or it is received through a single input port. If the request is defined in a file or in the graph, the response is written to a response file (single HTTP interaction). If the request is received through a port, the response is also sent out through a single output port (multi HTTP interaction); alternatively, it can be written to temporary files, and information about these files is sent to the specified output field.

HTTPConnector copies all metadata fields from the input edge to the output. This is why the output metadata must be a superset of the input metadata.
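The superset requirement can be checked before a graph is run. The following is a minimal sketch in Python, not part of the component: the function name `missing_output_fields` and the representation of metadata as `(name, type)` pairs are assumptions made for illustration, as is the choice to compare fields by both name and type.

```python
def missing_output_fields(input_fields, output_fields):
    """Return the input fields that the output metadata does not contain.

    Fields are hypothetical (name, type) pairs; an empty result means the
    output metadata is a superset of the input metadata.
    """
    available = set(output_fields)
    return [field for field in input_fields if field not in available]


# Illustrative metadata only; the field names are made up.
input_meta = [("url", "string"), ("page_id", "integer")]
output_meta = [("url", "string"), ("page_id", "integer"), ("response", "string")]

print(missing_output_fields(input_meta, output_meta))
```

An empty list means every input field can be copied to the output edge.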

HTTPConnector allows for advanced paging control via the specific CTL functions generateRequestParameters, checkResponse, and modifyRequestParamsBeforeRetryAttempt. These functions are pre-generated and described in the editor of the Request handling functions attribute.
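The way three such functions could cooperate is sketched below in Python. This is only an illustration of a paging loop with retries: it is not CTL, and the signatures here are assumptions (the real ones are pre-generated in the editor of the Request handling functions attribute). The `send` callable, the convention that returning `None` ends paging, and the Python-style function names are all hypothetical.

```python
def fetch_pages(send, generate_request_parameters, check_response,
                modify_params_before_retry=lambda params, attempt: params,
                max_pages=10_000, max_retries=5):
    """Request pages one by one, retrying each page up to max_retries times."""
    responses = []
    for page in range(max_pages):
        params = generate_request_parameters(page)
        if params is None:              # assumed convention: no more pages
            break
        for attempt in range(max_retries + 1):
            if attempt > 0:             # adjust parameters before each retry
                params = modify_params_before_retry(params, attempt)
            response = send(params)
            if check_response(response):
                responses.append(response)
                break
        else:                           # every attempt for this page failed
            raise RuntimeError(f"page {page} failed after {max_retries} retries")
    return responses


# Usage with a fake two-page server instead of a real HTTP call.
fake_pages = {0: "first", 1: "second"}
result = fetch_pages(
    send=lambda params: {"status": 200, "body": fake_pages[params["page_id"]]},
    generate_request_parameters=lambda page: {"page_id": page} if page in fake_pages else None,
    check_response=lambda response: response["status"] == 200,
)
print([r["body"] for r in result])
```

The defaults for `max_pages` and `max_retries` mirror the documented defaults of the Max. pages limit and Max. retry attempts attributes.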

Ports

| Port type | Number | Required | Description | Metadata |
| --- | --- | --- | --- | --- |
| Input | 0 | 1) | For URL, parameters for Query, or Request body | Any1 2) |
| Output | 0 | 1) | For response or for data records with URL of files containing response | Any2 3) |

Legend:

1): Either both or neither of them must be connected.

2): If connected, Input field may be left unspecified only if the first field (with URL from field) is of the string data type.

3): If connected, Output field may be left unspecified only if the first field is of the string data type.
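The three legend rules can be read as a small consistency check. The sketch below expresses them in Python under stated assumptions: the function `check_ports`, its parameters, and the error messages are hypothetical and are not part of the component.

```python
def check_ports(input_connected, output_connected,
                input_field=None, output_field=None,
                first_input_type=None, first_output_type=None):
    """Return a list of rule violations; an empty list means the setup is consistent."""
    problems = []
    # Legend 1): both ports are connected, or neither is.
    if input_connected != output_connected:
        problems.append("connect both ports or neither")
    # Legend 2): without an explicit Input field, the first input field must be a string.
    if input_connected and input_field is None and first_input_type != "string":
        problems.append("specify Input field or make the first input field a string")
    # Legend 3): the same rule applies on the output side.
    if output_connected and output_field is None and first_output_type != "string":
        problems.append("specify Output field or make the first output field a string")
    return problems


print(check_ports(True, True, first_input_type="string", first_output_type="string"))
```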

HTTPConnector Attributes

| Attribute | Req | Description | Possible values |
| --- | --- | --- | --- |
| Basic | | | |
| Authentication method | no | Authentication method. | Currently only the HTTP BASIC authentication scheme is supported. |
| Username | no | Authentication username. | |
| Password | no | Authentication password. | |
| Request URL | yes | The request URL that can contain parameters (e.g. ${page_id}). The parameter values can be defined within the Request handling functions. | |
| Request method | yes | Method of request. | GET (default) or POST |
| Request Headers | no | Request headers. A dialog is used to create them; the final form is a sequence of key=value pairs separated by commas, with the whole sequence surrounded by curly braces. | |
| Request Body | no | HTTP POST request body. | |
| Request handling functions | no | Functions that control paging, retry logic, etc. See the code editor for more function documentation and examples. | |
| Request handling functions URL | no | Functions that control paging, retry logic, etc., defined in a separate file. | |
| Charset | yes | Character encoding of the input/output files. | ISO-8859-1 (default) or other encoding |
| Advanced | | | |
| Delay between requests [secs] | yes | This value specifies the delay between individual requests. | Default value is 0. |
| Max. retry attempts | yes | Maximum number of retries that will be attempted if the previous attempts failed. | Default value is 5. |
| Pause between retries [secs] | yes | This value specifies the delay between individual retries in seconds. | Default is -1: exponential durations 1, 2, 4, 8, etc. |
| Max. pages limit (per input record) | yes | Maximum number of pages retrieved during the paging mechanism. | Default value is 10,000. |
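Three of the attribute formats above lend themselves to a short illustration: the ${page_id} style placeholders in Request URL, the curly-brace form of Request Headers, and the exponential schedule used when Pause between retries is -1. The Python sketch below is an approximation under assumptions, not the component's implementation. The helper names, the example URL, the absence of whitespace after commas, the lack of any escaping in header values, and the continuation of the schedule by doubling beyond 8 are all assumed.

```python
from string import Template


def fill_url(url_template, params):
    """Substitute ${name} placeholders in a request URL."""
    return Template(url_template).substitute(params)


def format_headers(headers):
    """Serialize headers as {key=value,key=value} (assumed exact spacing)."""
    return "{" + ",".join(f"{key}={value}" for key, value in headers.items()) + "}"


def retry_pauses(max_retries=5, pause=-1):
    """Pause in seconds before each retry; -1 selects the 1, 2, 4, 8, ... schedule."""
    if pause == -1:
        return [2 ** i for i in range(max_retries)]
    return [pause] * max_retries


# The URL and header values are made up for the example.
print(fill_url("http://example.com/items?page=${page_id}", {"page_id": 3}))
print(format_headers({"Accept": "application/json", "X-Demo": "1"}))
print(retry_pauses())
```

Python's `string.Template` happens to use the same ${name} syntax, which is why it serves as a stand-in here; a missing parameter raises `KeyError` rather than leaving the placeholder in the URL.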