Package crawlercommons.urlfrontier
Class URLFrontierGrpc.URLFrontierBlockingV2Stub
java.lang.Object
io.grpc.stub.AbstractStub<URLFrontierGrpc.URLFrontierBlockingV2Stub>
io.grpc.stub.AbstractBlockingStub<URLFrontierGrpc.URLFrontierBlockingV2Stub>
crawlercommons.urlfrontier.URLFrontierGrpc.URLFrontierBlockingV2Stub
- Enclosing class:
URLFrontierGrpc
public static final class URLFrontierGrpc.URLFrontierBlockingV2Stub
extends io.grpc.stub.AbstractBlockingStub<URLFrontierGrpc.URLFrontierBlockingV2Stub>
A stub that lets clients make synchronous RPC calls to the URLFrontier service.
-
Nested Class Summary
Nested classes/interfaces inherited from class io.grpc.stub.AbstractStub
io.grpc.stub.AbstractStub.StubFactory<T extends io.grpc.stub.AbstractStub<T>>
-
Method Summary
Modifier and Type, Method, Description
- Urlfrontier.Empty blockQueueUntil(Urlfrontier.BlockQueueParams request): Block a queue from sending URLs; the argument is the number of seconds of UTC time since Unix epoch 1970-01-01T00:00:00Z.
- protected URLFrontierGrpc.URLFrontierBlockingV2Stub build(io.grpc.Channel channel, io.grpc.CallOptions callOptions)
- Urlfrontier.Long countURLs(Urlfrontier.CountUrlParams request): Count URLs currently in the frontier.
- Urlfrontier.Long deleteCrawl(Urlfrontier.DeleteCrawlMessage request): Delete an entire crawl, returns the number of URLs removed this way.
- Urlfrontier.Long deleteQueue(Urlfrontier.QueueWithinCrawlParams request): Delete the queue based on the key in parameter, returns the number of URLs removed this way.
- getActive(Urlfrontier.Local request): Returns true if the crawl is active, false if it has been deactivated with SetActive(Boolean).
- Urlfrontier.Stats getStats(Urlfrontier.QueueWithinCrawlParams request): Return stats for a specific queue or an entire crawl.
- io.grpc.stub.BlockingClientCall<?, Urlfrontier.URLInfo> getURLs(Urlfrontier.GetParams request): Stream URLs due for fetching from M queues with up to N items per queue.
- Urlfrontier.URLItem getURLStatus(Urlfrontier.URLStatusRequest request): Get status of a particular URL. This does not take into account URL scheduling.
- listCrawls(Urlfrontier.Local request): Return the list of crawls handled by the frontier(s).
- listNodes(Urlfrontier.Empty request): Return the list of nodes forming the cluster the current node belongs to.
- Urlfrontier.QueueList listQueues(Urlfrontier.Pagination request): Return a list of queues for a specific crawl.
- io.grpc.stub.BlockingClientCall<?, Urlfrontier.URLItem> listURLs(Urlfrontier.ListUrlParams request): List all URLs currently in the frontier. This does not take into account URL scheduling.
- io.grpc.stub.BlockingClientCall<Urlfrontier.URLItem, Urlfrontier.AckMessage> putURLs(): Push URL items to the server; they get created (if they don't already exist) in case of DiscoveredURLItems or updated if KnownURLItems.
- setActive(Urlfrontier.Active request): De/activate the crawl.
- Urlfrontier.Empty setCrawlLimit(Urlfrontier.CrawlLimitParams request): Sets crawl limit for domain.
- Urlfrontier.Empty setDelay(Urlfrontier.QueueDelayParams request): Set a delay from a given queue.
- Urlfrontier.Empty setLogLevel(Urlfrontier.LogLevelParams request): Overrides the log level for a given package.
Methods inherited from class io.grpc.stub.AbstractBlockingStub
newStub, newStub
Methods inherited from class io.grpc.stub.AbstractStub
getCallOptions, getChannel, withCallCredentials, withChannel, withCompression, withDeadline, withDeadlineAfter, withDeadlineAfter, withExecutor, withInterceptors, withMaxInboundMessageSize, withMaxOutboundMessageSize, withOnReadyThreshold, withOption, withWaitForReady
-
Method Details
-
build
protected URLFrontierGrpc.URLFrontierBlockingV2Stub build(io.grpc.Channel channel, io.grpc.CallOptions callOptions)
- Specified by:
build in class io.grpc.stub.AbstractStub<URLFrontierGrpc.URLFrontierBlockingV2Stub>
-
listNodes
Return the list of nodes forming the cluster the current node belongs to.
- Throws:
io.grpc.StatusException
-
listCrawls
Return the list of crawls handled by the frontier(s).
- Throws:
io.grpc.StatusException
-
deleteCrawl
public Urlfrontier.Long deleteCrawl(Urlfrontier.DeleteCrawlMessage request) throws io.grpc.StatusException
Delete an entire crawl, returns the number of URLs removed this way.
- Throws:
io.grpc.StatusException
-
listQueues
public Urlfrontier.QueueList listQueues(Urlfrontier.Pagination request) throws io.grpc.StatusException
Return a list of queues for a specific crawl. The caller can choose whether to include inactive queues (a queue is active if it has URLs due for fetching); by default the service returns up to 100 results from offset 0 and excludes inactive queues.
- Throws:
io.grpc.StatusException
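Because results come back in pages (by default up to 100 from offset 0), collecting every queue means advancing the offset until a short page is returned. The following is a minimal sketch of that loop only, not the library's own code: QueuePager and PageFetcher are hypothetical names, and PageFetcher stands in for a call to listQueues with a start offset and page size, since the Pagination builder fields are not shown on this page.

```java
import java.util.ArrayList;
import java.util.List;

public class QueuePager {

    /** Hypothetical stand-in for one paged listQueues call. */
    public interface PageFetcher {
        /** Returns at most {@code size} queue keys, starting at offset {@code start}. */
        List<String> fetch(int start, int size);
    }

    /** Fetches page after page until a page shorter than pageSize signals the end. */
    public static List<String> fetchAll(PageFetcher fetcher, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<String> all = new ArrayList<>();
        int start = 0;
        while (true) {
            List<String> page = fetcher.fetch(start, pageSize);
            all.addAll(page);
            if (page.size() < pageSize) {
                break; // short (or empty) page: nothing left to read
            }
            start += page.size();
        }
        return all;
    }

    public static void main(String[] args) {
        // Fake backend with 250 queue keys, served in slices (assumption for the demo).
        List<String> backend = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            backend.add("queue-" + i);
        }
        PageFetcher fake = (start, size) -> backend.subList(
                Math.min(start, backend.size()),
                Math.min(start + size, backend.size()));
        System.out.println(fetchAll(fake, 100).size()); // prints 250
    }
}
```

When the total is an exact multiple of the page size, the loop costs one extra call that returns an empty page; that is the price of not knowing the total up front.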
-
getURLs
@ExperimentalApi("https://github.com/grpc/grpc-java/issues/10918")
public io.grpc.stub.BlockingClientCall<?,Urlfrontier.URLInfo> getURLs(Urlfrontier.GetParams request)
Stream URLs due for fetching from M queues with up to N items per queue.
-
putURLs
@ExperimentalApi("https://github.com/grpc/grpc-java/issues/10918")
public io.grpc.stub.BlockingClientCall<Urlfrontier.URLItem,Urlfrontier.AckMessage> putURLs()
Push URL items to the server. DiscoveredURLItems are created if they do not already exist; KnownURLItems are updated.
-
getStats
public Urlfrontier.Stats getStats(Urlfrontier.QueueWithinCrawlParams request) throws io.grpc.StatusException
Return stats for a specific queue or an entire crawl. Does not aggregate the stats across different crawlids.
- Throws:
io.grpc.StatusException
-
deleteQueue
public Urlfrontier.Long deleteQueue(Urlfrontier.QueueWithinCrawlParams request) throws io.grpc.StatusException
Delete the queue based on the key in parameter, returns the number of URLs removed this way.
- Throws:
io.grpc.StatusException
-
blockQueueUntil
public Urlfrontier.Empty blockQueueUntil(Urlfrontier.BlockQueueParams request) throws io.grpc.StatusException
Block a queue from sending URLs; the argument is the number of seconds of UTC time since Unix epoch 1970-01-01T00:00:00Z. The default value of 0 will unblock the queue. The block will get removed once the time indicated in argument is reached. This is useful for cases where a server returns a Retry-After for instance.
- Throws:
io.grpc.StatusException
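The epoch-seconds argument described above can be derived from a Retry-After header value using only java.time. The sketch below is an illustration under assumptions: the class RetryAfterToEpoch and the method blockUntilEpochSeconds are hypothetical names, and building the actual BlockQueueParams request is left out because its fields are not shown on this page. It handles both forms a Retry-After value may take (a delay in seconds or an HTTP-date) and falls back to 0, the documented "unblock" value, when the header cannot be parsed.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class RetryAfterToEpoch {

    /**
     * Converts a Retry-After header value into seconds since the Unix epoch (UTC).
     * Returns 0 when the value is not positive or cannot be parsed.
     */
    public static long blockUntilEpochSeconds(String retryAfter, Instant now) {
        String value = retryAfter.trim();
        try {
            // Form 1: a delay in seconds, e.g. "120"
            long delta = Long.parseLong(value);
            return delta <= 0 ? 0L : now.getEpochSecond() + delta;
        } catch (NumberFormatException notANumber) {
            // Not a plain number: try the date form next.
        }
        try {
            // Form 2: an HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
            return ZonedDateTime.parse(value, DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant()
                    .getEpochSecond();
        } catch (DateTimeParseException notADate) {
            return 0L;
        }
    }

    public static void main(String[] args) {
        Instant now = Instant.ofEpochSecond(1_700_000_000L); // fixed clock for the demo
        System.out.println(blockUntilEpochSeconds("120", now)); // prints 1700000120
        System.out.println(blockUntilEpochSeconds("Wed, 21 Oct 2015 07:28:00 GMT", now)); // prints 1445412480
    }
}
```

Passing the current time in as a parameter, rather than calling Instant.now() inside the method, keeps the conversion deterministic and easy to test.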
-
setActive
De/activate the crawl. GetURLs will not return anything until SetActive is set to true. PutURLs will still take incoming data.
- Throws:
io.grpc.StatusException
-
getActive
Returns true if the crawl is active, false if it has been deactivated with SetActive(Boolean).
- Throws:
io.grpc.StatusException
-
setDelay
public Urlfrontier.Empty setDelay(Urlfrontier.QueueDelayParams request) throws io.grpc.StatusException
Set a delay for a given queue. No URLs will be obtained via GetURLs for this queue until the number of seconds specified has elapsed since the last time URLs were retrieved. Usually informed by the delay setting of robots.txt.
- Throws:
io.grpc.StatusException
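Since the delay is usually taken from robots.txt, a caller needs a number of seconds to send. The sketch below shows one way to extract it, under loud assumptions: CrawlDelayParser and crawlDelaySeconds are hypothetical names, the parser ignores User-agent grouping and simply returns the first Crawl-delay value it finds, and fractional values are rounded up to whole seconds. Putting the result into a QueueDelayParams request is not shown because its fields are not listed on this page.

```java
import java.util.OptionalInt;

public class CrawlDelayParser {

    /**
     * Returns the first non-negative Crawl-delay value found in the robots.txt
     * text, rounded up to whole seconds, or empty if none is present.
     * Simplification: User-agent groups are not taken into account.
     */
    public static OptionalInt crawlDelaySeconds(String robotsTxt) {
        for (String rawLine : robotsTxt.split("\\R")) {
            // Drop trailing comments, then surrounding whitespace.
            int hash = rawLine.indexOf('#');
            String line = (hash >= 0 ? rawLine.substring(0, hash) : rawLine).trim();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;
            }
            String field = line.substring(0, colon).trim();
            if (!field.equalsIgnoreCase("crawl-delay")) {
                continue;
            }
            try {
                double seconds = Double.parseDouble(line.substring(colon + 1).trim());
                if (seconds >= 0) {
                    return OptionalInt.of((int) Math.ceil(seconds));
                }
            } catch (NumberFormatException malformed) {
                // Malformed value: keep scanning the remaining lines.
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        String robots = "User-agent: *\nCrawl-delay: 2.5 # be polite\nDisallow: /private\n";
        System.out.println(crawlDelaySeconds(robots).getAsInt()); // prints 3
    }
}
```

Rounding up rather than down errs on the side of politeness: a site asking for 2.5 seconds is never fetched more often than every 3.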
-
setLogLevel
public Urlfrontier.Empty setLogLevel(Urlfrontier.LogLevelParams request) throws io.grpc.StatusException
Overrides the log level for a given package.
- Throws:
io.grpc.StatusException
-
setCrawlLimit
public Urlfrontier.Empty setCrawlLimit(Urlfrontier.CrawlLimitParams request) throws io.grpc.StatusException
Sets crawl limit for domain.
- Throws:
io.grpc.StatusException
-
getURLStatus
public Urlfrontier.URLItem getURLStatus(Urlfrontier.URLStatusRequest request) throws io.grpc.StatusException
Get the status of a particular URL. This does not take into account URL scheduling. Used to check the current status of a URL within the frontier.
- Throws:
io.grpc.StatusException
-
listURLs
@ExperimentalApi("https://github.com/grpc/grpc-java/issues/10918")
public io.grpc.stub.BlockingClientCall<?,Urlfrontier.URLItem> listURLs(Urlfrontier.ListUrlParams request)
List all URLs currently in the frontier. This does not take into account URL scheduling. Used to check the current status of all URLs within the frontier.
-
countURLs
public Urlfrontier.Long countURLs(Urlfrontier.CountUrlParams request) throws io.grpc.StatusException
Count URLs currently in the frontier.
- Throws:
io.grpc.StatusException
-