Dlib is principally a C++ library; however, you can use a number of its tools from Python applications. This page documents the Python API for working with these dlib tools. If you haven’t done so already, you should probably look at the Python example programs before consulting this reference. These example programs are mini-tutorials for using dlib from Python. They are listed on the left of the main dlib web page.
Classes¶
Functions¶
Constants¶
dlib.DLIB_USE_BLAS
dlib.DLIB_USE_CUDA
dlib.DLIB_USE_LAPACK
dlib.KBD_MOD_ALT
dlib.KBD_MOD_CAPS_LOCK
dlib.KBD_MOD_CONTROL
dlib.KBD_MOD_META
dlib.KBD_MOD_NONE
dlib.KBD_MOD_NUM_LOCK
dlib.KBD_MOD_SCROLL_LOCK
dlib.KBD_MOD_SHIFT
dlib.KEY_ALT
dlib.KEY_BACKSPACE
dlib.KEY_CAPS_LOCK
dlib.KEY_CTRL
dlib.KEY_DELETE
dlib.KEY_DOWN
dlib.KEY_END
dlib.KEY_ESC
dlib.KEY_F1
dlib.KEY_F10
dlib.KEY_F11
dlib.KEY_F12
dlib.KEY_F2
dlib.KEY_F3
dlib.KEY_F4
dlib.KEY_F5
dlib.KEY_F6
dlib.KEY_F7
dlib.KEY_F8
dlib.KEY_F9
dlib.KEY_HOME
dlib.KEY_INSERT
dlib.KEY_LEFT
dlib.KEY_PAGE_DOWN
dlib.KEY_PAGE_UP
dlib.KEY_PAUSE
dlib.KEY_RIGHT
dlib.KEY_SCROLL_LOCK
dlib.KEY_SHIFT
dlib.KEY_UP
dlib.USE_AVX_INSTRUCTIONS
dlib.USE_NEON_INSTRUCTIONS
Detailed API Listing¶
- dlib.angle_between_lines(a: dlib.line, b: dlib.line) float ¶
- ensures
returns the angle, in degrees, between the given lines. This is a number in the range [0, 90].
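The range [0, 90] follows from treating lines as undirected. The sketch below illustrates that idea in plain Python; it is not dlib's implementation, and the function name and the point-pair representation of a line are assumptions made for the example.

```python
import math

def angle_between_lines_deg(a, b):
    """Angle in degrees, in [0, 90], between two lines.

    Each line is given as a pair of distinct points ((x1, y1), (x2, y2)).
    """
    (ax1, ay1), (ax2, ay2) = a
    (bx1, by1), (bx2, by2) = b
    ux, uy = ax2 - ax1, ay2 - ay1  # direction of line a
    vx, vy = bx2 - bx1, by2 - by1  # direction of line b
    dot = ux * vx + uy * vy
    norm = math.hypot(ux, uy) * math.hypot(vx, vy)
    # abs() ignores the direction of travel along each line, which folds
    # the angle into [0, 90]; min() guards against rounding above 1.
    cos_theta = min(1.0, abs(dot) / norm)
    return math.degrees(math.acos(cos_theta))

horizontal = ((0, 0), (1, 0))
vertical = ((0, 0), (0, 1))
diagonal = ((0, 0), (-1, -1))
```

Here the horizontal and vertical lines are 90 degrees apart, and the diagonal is 45 degrees from the horizontal even though its direction vector points the opposite way.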
- dlib.apply_cca_transform(m: dlib.matrix, v: dlib.sparse_vector) dlib.vector ¶
- requires
max_index_plus_one(v) <= m.nr()
- ensures
returns trans(m)*v (i.e. multiply the transpose of m by the vector v and return the result)
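To make the requirement and the result shape concrete, here is a plain-Python sketch of the product trans(m)*v for a sparse v. It assumes a list-of-rows matrix and a sparse vector written as (index, value) pairs; both representations and the function name are illustrative stand-ins, not dlib types.

```python
def apply_transposed(m, v):
    """Return trans(m) * v for a dense matrix m and a sparse vector v.

    m is a list of equal-length rows; v is a list of (index, value) pairs.
    """
    nr, nc = len(m), len(m[0])
    # Mirrors the documented requirement max_index_plus_one(v) <= m.nr().
    if any(index >= nr for index, _ in v):
        raise ValueError("max_index_plus_one(v) must be <= m.nr()")
    out = [0.0] * nc
    for index, value in v:
        # Row `index` of m is column `index` of trans(m).
        for c in range(nc):
            out[c] += m[index][c] * value
    return out

m = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]
v = [(0, 1.0), (2, 2.0)]  # sparse form of the dense vector [1, 0, 2]
result = apply_transposed(m, v)
```

The output has one element per column of m, which is why the indices in v only need to stay below the number of rows.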
- class dlib.array¶
This object represents a 1D array of floating point numbers. Moreover, it binds directly to the C++ type std::vector<double>.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.array) -> None
__init__(self: dlib.array, arg0: dlib.array) -> None
Copy constructor
__init__(self: dlib.array, arg0: iterable) -> None
__init__(self: dlib.array, arg0: object) -> None
- append(self: dlib.array, x: float) None ¶
Add an item to the end of the list
- clear(self: dlib.array) None ¶
- count(self: dlib.array, x: float) int ¶
Return the number of times x appears in the list
- extend(*args, **kwargs)¶
Overloaded function.
extend(self: dlib.array, L: dlib.array) -> None
Extend the list by appending all the items in the given list
extend(self: dlib.array, arg0: list) -> None
- insert(self: dlib.array, i: int, x: float) None ¶
Insert an item at a given position.
- pop(*args, **kwargs)¶
Overloaded function.
pop(self: dlib.array) -> float
Remove and return the last item
pop(self: dlib.array, i: int) -> float
Remove and return the item at index i
- remove(self: dlib.array, x: float) None ¶
Remove the first item from the list whose value is x. It is an error if there is no such item.
- resize(self: dlib.array, arg0: int) None ¶
- dlib.as_grayscale(img: array) array ¶
Convert an image to 8bit grayscale. If it’s already a grayscale image do nothing and just return img.
- dlib.assignment_cost(cost: dlib.matrix, assignment: list) float ¶
- requires
cost.nr() == cost.nc() (i.e. the input must be a square matrix)
- for all valid i:
0 <= assignment[i] < cost.nr()
- ensures
Interprets cost as a cost assignment matrix. That is, cost[i][j] represents the cost of assigning i to j.
Interprets assignment as a particular set of assignments. That is, i is assigned to assignment[i].
returns the cost of the given assignment. That is, returns a number which is:
sum over i: cost[i][assignment[i]]
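The sum above can be written out in a few lines of plain Python. This is a reference sketch of the documented formula using nested lists in place of dlib.matrix; the function name is hypothetical.

```python
def assignment_cost_reference(cost, assignment):
    """Return sum over i of cost[i][assignment[i]]."""
    n = len(cost)
    # Documented requirement: the cost matrix must be square.
    if any(len(row) != n for row in cost):
        raise ValueError("cost must be a square matrix")
    # Documented requirement: 0 <= assignment[i] < cost.nr().
    if any(not 0 <= j < n for j in assignment):
        raise ValueError("assignment entries must index a column of cost")
    return sum(cost[i][j] for i, j in enumerate(assignment))

cost = [[1, 2, 6],
        [5, 3, 6],
        [4, 5, 0]]
assignment = [1, 0, 2]  # 0 -> 1, 1 -> 0, 2 -> 2
total = assignment_cost_reference(cost, assignment)
```

With this input the total is cost[0][1] + cost[1][0] + cost[2][2] = 2 + 5 + 0.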
- dlib.auto_train_rbf_classifier(*args, **kwargs)¶
Overloaded function.
auto_train_rbf_classifier(x: dlib.vectors, y: dlib.array, max_runtime_seconds: float, be_verbose: bool=True) -> dlib._normalized_decision_function_radial_basis
- requires
y contains at least 6 examples of each class. Moreover, every element in y is either +1 or -1.
max_runtime_seconds >= 0
len(x) == len(y)
all the vectors in x have the same dimension.
- ensures
This routine trains a radial basis function SVM on the given binary classification training data. It uses the svm_c_trainer to do this. It also uses find_max_global() and 6-fold cross-validation to automatically determine the best settings of the SVM’s hyper parameters.
Note that we interpret y[i] as the label for the vector x[i]. Therefore, the returned function, df, should generally satisfy sign(df(x[i])) == y[i] as often as possible.
The hyperparameter search will run for about max_runtime_seconds seconds and will print messages to the screen as it runs if be_verbose is True.
auto_train_rbf_classifier(x: numpy.ndarray[(rows,cols),float64], y: numpy.ndarray[float64], max_runtime_seconds: float, be_verbose: bool=True) -> dlib._normalized_decision_function_radial_basis
- requires
y contains at least 6 examples of each class. Moreover, every element in y is either +1 or -1.
max_runtime_seconds >= 0
x.shape[0] == len(y)
x.shape[1] > 0
- ensures
This routine trains a radial basis function SVM on the given binary classification training data. It uses the svm_c_trainer to do this. It also uses find_max_global() and 6-fold cross-validation to automatically determine the best settings of the SVM’s hyper parameters.
Note that we interpret y[i] as the label for the vector x[i]. Therefore, the returned function, df, should generally satisfy sign(df(x[i])) == y[i] as often as possible.
The hyperparameter search will run for about max_runtime_seconds seconds and will print messages to the screen as it runs if be_verbose is True.
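Because the requirements above are easy to violate with small or unbalanced datasets, it can help to check them before starting a long search. The helper below is a hypothetical pre-flight check written in plain Python over lists; it is not part of dlib.

```python
from collections import Counter

def check_rbf_training_inputs(x, y, max_runtime_seconds):
    """Return a list of violated requirements (empty if all hold)."""
    problems = []
    if len(x) != len(y):
        problems.append("len(x) != len(y)")
    if any(label not in (+1, -1) for label in y):
        problems.append("labels must be +1 or -1")
    counts = Counter(y)
    # 6-fold cross-validation needs at least 6 examples of each class.
    if counts[+1] < 6 or counts[-1] < 6:
        problems.append("need at least 6 examples of each class")
    if len({len(sample) for sample in x}) > 1:
        problems.append("all vectors in x must have the same dimension")
    if max_runtime_seconds < 0:
        problems.append("max_runtime_seconds must be >= 0")
    return problems

x = [[float(i), float(i + 1)] for i in range(12)]
y = [+1] * 6 + [-1] * 6
problems = check_rbf_training_inputs(x, y, max_runtime_seconds=10)
```

An empty result means the inputs satisfy every listed requirement; anything else names the requirement that failed.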
- dlib.cca(L: dlib.sparse_vectors, R: dlib.sparse_vectors, num_correlations: int, extra_rank: int = 5, q: int = 2, regularization: float = 0) dlib.cca_outputs ¶
- requires
num_correlations > 0
len(L) > 0
len(R) > 0
len(L) == len(R)
regularization >= 0
L and R must be properly sorted sparse vectors. This means they must list their elements in ascending index order and not contain duplicate index values. You can use make_sparse_vector() to ensure this is true.
- ensures
This function performs a canonical correlation analysis between the vectors in L and R. That is, it finds two transformation matrices, Ltrans and Rtrans, such that row vectors in the transformed matrices L*Ltrans and R*Rtrans are as correlated as possible (note that in this notation we interpret L as a matrix with the input vectors in its rows). Note also that this function tries to find transformations which produce num_correlations dimensional output vectors.
Note that you can easily apply the transformation to a vector using apply_cca_transform(). So for example, like this:
apply_cca_transform(Ltrans, some_sparse_vector)
returns a structure containing the Ltrans and Rtrans transformation matrices as well as the estimated correlations between elements of the transformed vectors.
This function assumes the data vectors in L and R have already been centered (i.e. we assume the vectors have zero means). However, in many cases it is fine to use uncentered data with cca(). But if it is important for your problem then you should center your data before passing it to cca().
This function works with reduced rank approximations of the L and R matrices. This makes it fast when working with large matrices. In particular, we use the dlib::svd_fast() routine to find reduced rank representations of the input matrices by calling it as follows: svd_fast(L, U,D,V, num_correlations+extra_rank, q) and similarly for R. This means that you can use the extra_rank and q arguments to cca() to influence the accuracy of the reduced rank approximation. However, the default values should work fine for most problems.
The dimensions of the output vectors produced by L*Ltrans or R*Rtrans are ordered such that the dimensions with the highest correlations come first. That is, after applying the transforms produced by cca() to a set of vectors you will find that dimension 0 has the highest correlation, then dimension 1 has the next highest, and so on. This also means that the list of estimated correlations returned from cca() will always be listed in decreasing order.
This function performs the ridge regression version of Canonical Correlation Analysis when regularization is set to a value > 0. In particular, larger values indicate the solution should be more heavily regularized. This can be useful when the dimensionality of the data is larger than the number of samples.
A good discussion of CCA can be found in the paper “Canonical Correlation Analysis” by David Weenink. In particular, this function is implemented using equations 29 and 30 from his paper. We also use the idea of doing CCA on a reduced rank approximation of L and R as suggested by Paramveer S. Dhillon in his paper “Two Step CCA: A new spectral method for estimating vector models of words”.
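The "properly sorted" requirement on L and R amounts to each vector's indices being strictly increasing. The sketch below checks that property on a sparse vector written as a list of (index, value) pairs; that representation and the function name are assumptions for illustration, not dlib types.

```python
def is_properly_sorted(v):
    """True if indices are in ascending order with no duplicates."""
    indices = [index for index, _ in v]
    # Strictly increasing covers both conditions at once.
    return all(a < b for a, b in zip(indices, indices[1:]))

good = [(0, 1.5), (3, -2.0), (7, 0.25)]
out_of_order = [(3, -2.0), (0, 1.5)]
duplicated = [(1, 1.0), (1, 2.0)]
```

Only the first vector passes; the other two would need to be fixed first, for example with make_sparse_vector() as noted above.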
- class dlib.cca_outputs¶
- property Ltrans¶
- property Rtrans¶
- __init__(*args, **kwargs)¶
- property correlations¶
- dlib.center(*args, **kwargs)¶
Overloaded function.
center(rect: dlib.rectangle) -> dlib.point
returns the center of the given rectangle
center(rect: dlib.drectangle) -> dlib.dpoint
returns the center of the given rectangle
- dlib.centered_rect(*args, **kwargs)¶
Overloaded function.
centered_rect(p: dlib.point, width: int, height: int) -> dlib.rectangle
centered_rect(p: dlib.dpoint, width: int, height: int) -> dlib.rectangle
centered_rect(rect: dlib.rectangle, width: int, height: int) -> dlib.rectangle
centered_rect(rect: dlib.drectangle, width: int, height: int) -> dlib.rectangle
- dlib.centered_rects(pts: dlib.points, width: int, height: int) dlib.rectangles ¶
- dlib.chinese_whispers(edges: list) list ¶
Given a graph with vertices represented as numbers indexed from 0, this algorithm takes a list of edges and returns a list that contains a label (the cluster found) for each vertex. Edges are tuples with either 2 elements (integers giving the indexes of the connected vertices) or 3 elements, where the additional element is a float giving the distance weight of the edge. Offers direct access to dlib::chinese_whispers.
- dlib.chinese_whispers_clustering(descriptors: list, threshold: float) list ¶
Takes a list of descriptors and returns a list that contains a label for each descriptor. Clustering is done using dlib::chinese_whispers.
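To show how descriptors and a threshold relate to the edge-list format, here is one plausible way to build 2-element edges. Connecting pairs whose Euclidean distance is strictly below the threshold is an assumption of this sketch; this page does not say how chinese_whispers_clustering builds its graph.

```python
import math

def edges_within_threshold(descriptors, threshold):
    """Build (i, j) edges between descriptors closer than threshold.

    Assumption: Euclidean distance with a strict comparison.
    """
    edges = []
    for i in range(len(descriptors)):
        for j in range(i + 1, len(descriptors)):
            if math.dist(descriptors[i], descriptors[j]) < threshold:
                edges.append((i, j))
    return edges

descriptors = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
edges = edges_within_threshold(descriptors, threshold=0.5)
```

The resulting list has the 2-element tuple shape that chinese_whispers() accepts. Vertex 2 has no edges here, so a clustering of this graph would be expected to leave it on its own.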
- class dlib.chip_details¶
WHAT THIS OBJECT REPRESENTS
This object describes where an image chip is to be extracted from within another image. In particular, it specifies that the image chip is contained within the rectangle self.rect and that prior to extraction the image should be rotated counter-clockwise by self.angle radians. Finally, the extracted chip should have self.rows rows and self.cols columns in it regardless of the shape of self.rect. This means that the extracted chip will be stretched to fit via bilinear interpolation when necessary.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.chip_details, rect: dlib.drectangle) -> None
__init__(self: dlib.chip_details, rect: dlib.rectangle) -> None
- ensures
self.rect == rect
self.angle == 0
self.rows == rect.height()
self.cols == rect.width()
__init__(self: dlib.chip_details, rect: dlib.drectangle, size: int) -> None
__init__(self: dlib.chip_details, rect: dlib.rectangle, size: int) -> None
- ensures
self.rect == rect
self.angle == 0
self.rows and self.cols are set such that the total size of the chip is as close to size as possible but still matches the aspect ratio of rect.
As long as size and the aspect ratio of rect stay constant then self.rows and self.cols will always have the same values. This means that, for example, if you want all your chips to have the same dimensions then ensure that size is always the same and also that rect always has the same aspect ratio. Otherwise the calculated values of self.rows and self.cols may be different for different chips. Alternatively, you can use the chip_details constructor below that lets you specify the exact values for rows and cols.
__init__(self: dlib.chip_details, rect: dlib.drectangle, size: int, angle: float) -> None
__init__(self: dlib.chip_details, rect: dlib.rectangle, size: int, angle: float) -> None
- ensures
self.rect == rect
self.angle == angle
self.rows and self.cols are set such that the total size of the chip is as close to size as possible but still matches the aspect ratio of rect.
As long as size and the aspect ratio of rect stay constant then self.rows and self.cols will always have the same values. This means that, for example, if you want all your chips to have the same dimensions then ensure that size is always the same and also that rect always has the same aspect ratio. Otherwise the calculated values of self.rows and self.cols may be different for different chips. Alternatively, you can use the chip_details constructor below that lets you specify the exact values for rows and cols.
__init__(self: dlib.chip_details, rect: dlib.drectangle, dims: dlib.chip_dims) -> None
__init__(self: dlib.chip_details, rect: dlib.rectangle, dims: dlib.chip_dims) -> None
- ensures
self.rect == rect
self.angle == 0
self.rows == dims.rows
self.cols == dims.cols
__init__(self: dlib.chip_details, rect: dlib.drectangle, dims: dlib.chip_dims, angle: float) -> None
__init__(self: dlib.chip_details, rect: dlib.rectangle, dims: dlib.chip_dims, angle: float) -> None
- ensures
self.rect == rect
self.angle == angle
self.rows == dims.rows
self.cols == dims.cols
__init__(self: dlib.chip_details, chip_points: dlib.dpoints, img_points: dlib.dpoints, dims: dlib.chip_dims) -> None
__init__(self: dlib.chip_details, chip_points: dlib.points, img_points: dlib.points, dims: dlib.chip_dims) -> None
- requires
len(chip_points) == len(img_points)
len(chip_points) >= 2
- ensures
The chip will be extracted such that the pixel locations chip_points[i] in the chip are mapped to img_points[i] in the original image by a similarity transform. That is, if you know the pixelwise mapping you want between the chip and the original image then you can use this overload of the chip_details constructor to define the mapping.
self.rows == dims.rows
self.cols == dims.cols
self.rect and self.angle are computed based on the given size of the output chip (specified by dims) and the similarity transform between the chip and image (specified by chip_points and img_points).
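For the constructors that take a size argument, the sketch below shows one way rows and cols could be chosen so that rows*cols is close to size while keeping the aspect ratio of rect. The rounding scheme and the function name are assumptions; this page does not state the exact formula dlib uses.

```python
import math

def dims_from_size(rect_width, rect_height, size):
    """Pick (rows, cols) with rows*cols near size and rect's aspect ratio."""
    # Scale factor that would make the rectangle's area equal to size.
    scale = math.sqrt(size / (rect_width * rect_height))
    rows = max(1, int(rect_height * scale + 0.5))  # round to nearest
    cols = max(1, int(size / rows + 0.5))
    return rows, cols

small = dims_from_size(rect_width=100, rect_height=50, size=200)
large = dims_from_size(rect_width=200, rect_height=100, size=200)
```

Both rectangles have a 2:1 aspect ratio and the same size, so they produce the same chip dimensions, which is the property described in the size-based overloads.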
- property angle¶
- property cols¶
- property rect¶
- property rows¶
- class dlib.chip_detailss¶
An array of chip_details objects.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.chip_detailss) -> None
__init__(self: dlib.chip_detailss, arg0: dlib.chip_detailss) -> None
Copy constructor
__init__(self: dlib.chip_detailss, arg0: iterable) -> None
- append(self: dlib.chip_detailss, x: dlib.chip_details) None ¶
Add an item to the end of the list
- extend(*args, **kwargs)¶
Overloaded function.
extend(self: dlib.chip_detailss, L: dlib.chip_detailss) -> None
Extend the list by appending all the items in the given list
extend(self: std::vector<std::vector<dlib::chip_details, std::allocator<dlib::chip_details> >, std::allocator<std::vector<dlib::chip_details, std::allocator<dlib::chip_details> > > >, arg0: list) -> None
- insert(self: dlib.chip_detailss, i: int, x: dlib.chip_details) None ¶
Insert an item at a given position.
- pop(*args, **kwargs)¶
Overloaded function.
pop(self: dlib.chip_detailss) -> dlib.chip_details
Remove and return the last item
pop(self: dlib.chip_detailss, i: int) -> dlib.chip_details
Remove and return the item at index i
- class dlib.chip_dims¶
WHAT THIS OBJECT REPRESENTS
This is a simple tool for passing in a pair of row and column values to the chip_details constructor.
- __init__(self: dlib.chip_dims, rows: int, cols: int) None ¶
- property cols¶
- property rows¶
- class dlib.cnn_face_detection_model_v1¶
This object detects human faces in an image. The constructor loads the face detection model from a file. You can download a pre-trained model from http://dlib.net/files/mmod_human_face_detector.dat.bz2.
- __call__(*args, **kwargs)¶
Overloaded function.
__call__(self: dlib.cnn_face_detection_model_v1, imgs: list, upsample_num_times: int=0, batch_size: int=128) -> std::vector<std::vector<dlib::mmod_rect, std::allocator<dlib::mmod_rect> >, std::allocator<std::vector<dlib::mmod_rect, std::allocator<dlib::mmod_rect> > > >
Takes a list of images as input and returns a 2D list of mmod rectangles.
__call__(self: dlib.cnn_face_detection_model_v1, img: array, upsample_num_times: int=0) -> std::vector<dlib::mmod_rect, std::allocator<dlib::mmod_rect> >
- Find faces in an image using a deep learning model.
Upsamples the image upsample_num_times before running the face detector.
- __init__(self: dlib.cnn_face_detection_model_v1, filename: str) None ¶
- dlib.convert_image(*args, **kwargs)¶
Overloaded function.
convert_image(img: numpy.ndarray[(rows,cols),uint8], dtype: str) -> array
convert_image(img: numpy.ndarray[(rows,cols),uint16], dtype: str) -> array
convert_image(img: numpy.ndarray[(rows,cols),uint32], dtype: str) -> array
convert_image(img: numpy.ndarray[(rows,cols),uint64], dtype: str) -> array
convert_image(img: numpy.ndarray[(rows,cols),int8], dtype: str) -> array
convert_image(img: numpy.ndarray[(rows,cols),int16], dtype: str) -> array
convert_image(img: numpy.ndarray[(rows,cols),int32], dtype: str) -> array
convert_image(img: numpy.ndarray[(rows,cols),int64], dtype: str) -> array
convert_image(img: numpy.ndarray[(rows,cols),float32], dtype: str) -> array
convert_image(img: numpy.ndarray[(rows,cols),float64], dtype: str) -> array
convert_image(img: numpy.ndarray[(rows,cols,3),uint8], dtype: str) -> array
- Converts an image to a target pixel type. dtype must be a string containing one of the following:
uint8, int8, uint16, int16, uint32, int32, uint64, int64, float32, float, float64, double, or rgb_pixel
When converting from a color space with more than 255 values the pixel intensity is saturated at the minimum and maximum pixel values of the target pixel type. For example, if you convert a float valued image to uint8 then float values will be truncated to integers and values larger than 255 are converted to 255 while values less than 0 are converted to 0.
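The float-to-uint8 case described above can be sketched for a single pixel value as follows. This is a plain-Python illustration of the stated rule (saturate, then truncate), not dlib code, and the function name is hypothetical.

```python
def to_uint8(value):
    """Saturate a float pixel value to [0, 255] and truncate to an integer."""
    clamped = max(0.0, min(255.0, value))
    return int(clamped)  # int() truncates toward zero

converted = [to_uint8(v) for v in (300.7, -5.0, 12.9)]
```

The three inputs show the three behaviours in turn: saturation at the top of the range, saturation at the bottom, and truncation of the fractional part.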
- dlib.convert_image_scaled(*args, **kwargs)¶
Overloaded function.
convert_image_scaled(img: numpy.ndarray[(rows,cols),uint8], dtype: str, thresh: float=4) -> array
convert_image_scaled(img: numpy.ndarray[(rows,cols),uint16], dtype: str, thresh: float=4) -> array
convert_image_scaled(img: numpy.ndarray[(rows,cols),uint32], dtype: str, thresh: float=4) -> array
convert_image_scaled(img: numpy.ndarray[(rows,cols),uint64], dtype: str, thresh: float=4) -> array
convert_image_scaled(img: numpy.ndarray[(rows,cols),int8], dtype: str, thresh: float=4) -> array
convert_image_scaled(img: numpy.ndarray[(rows,cols),int16], dtype: str, thresh: float=4) -> array
convert_image_scaled(img: numpy.ndarray[(rows,cols),int32], dtype: str, thresh: float=4) -> array
convert_image_scaled(img: numpy.ndarray[(rows,cols),int64], dtype: str, thresh: float=4) -> array
convert_image_scaled(img: numpy.ndarray[(rows,cols),float32], dtype: str, thresh: float=4) -> array
convert_image_scaled(img: numpy.ndarray[(rows,cols),float64], dtype: str, thresh: float=4) -> array
convert_image_scaled(img: numpy.ndarray[(rows,cols,3),uint8], dtype: str, thresh: float=4) -> array
- requires
thresh > 0
- ensures
Converts an image to a target pixel type. dtype must be a string containing one of the following: uint8, int8, uint16, int16, uint32, int32, uint64, int64, float32, float, float64, double, or rgb_pixel
The contents of img will be scaled to fit the dynamic range of the target pixel type. The thresh parameter is used to filter source pixel values which are outliers. These outliers will saturate at the edge of the destination image’s dynamic range.
- Specifically, for all valid r and c:
We scale img[r][c] into the dynamic range of the target pixel type. This is done using the mean and standard deviation of img. Call the mean M and the standard deviation D. Then the scaling from source to destination is performed using the following mapping:
let SRC_UPPER = min(M + thresh*D, max(img))
let SRC_LOWER = max(M - thresh*D, min(img))
let DEST_UPPER = max value possible for the selected dtype.
let DEST_LOWER = min value possible for the selected dtype.
MAPPING: [SRC_LOWER, SRC_UPPER] -> [DEST_LOWER, DEST_UPPER]
Where this mapping is a linear mapping of values from the left range into the right range of values. Source pixel values outside the left range are modified to be at the appropriate end of the range.
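The mapping above can be sketched in plain Python on a flat list of pixel values. Two details are assumptions of this sketch rather than statements from the documentation: the standard deviation is the population form, and a constant image maps every pixel to DEST_LOWER. The function name is hypothetical.

```python
import statistics

def scale_pixels(pixels, dest_lower, dest_upper, thresh=4.0):
    """Linearly map [SRC_LOWER, SRC_UPPER] onto [dest_lower, dest_upper]."""
    mean = statistics.mean(pixels)
    std = statistics.pstdev(pixels)  # assumption: population std dev
    src_upper = min(mean + thresh * std, max(pixels))
    src_lower = max(mean - thresh * std, min(pixels))
    if src_upper == src_lower:
        # Assumption: a constant image has no range to stretch.
        return [dest_lower for _ in pixels]
    scale = (dest_upper - dest_lower) / (src_upper - src_lower)
    out = []
    for p in pixels:
        # Outliers saturate at the ends of the source range.
        p = min(src_upper, max(src_lower, p))
        out.append(dest_lower + (p - src_lower) * scale)
    return out

scaled = scale_pixels([0.0, 5.0, 10.0], dest_lower=0.0, dest_upper=255.0)
```

With a small thresh, the source range narrows around the mean and more pixels are treated as outliers, so they land on the destination limits.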
- class dlib.correlation_tracker¶
This is a tool for tracking moving objects in a video stream. You give it the bounding box of an object in the first frame and it attempts to track the object in the box from frame to frame. This tool is an implementation of the method described in the following paper:
Danelljan, Martin, et al. ‘Accurate scale estimation for robust visual tracking.’ Proceedings of the British Machine Vision Conference BMVC. 2014.
- __init__(self: dlib.correlation_tracker) None ¶
- get_position(self: dlib.correlation_tracker) dlib.drectangle ¶
returns the predicted position of the object under track.
- start_track(*args, **kwargs)¶
Overloaded function.
start_track(self: dlib.correlation_tracker, image: array, bounding_box: dlib.drectangle) -> None
- requires
image is a numpy ndarray containing either an 8bit grayscale or RGB image.
bounding_box.is_empty() == false
- ensures
This object will start tracking the thing inside the bounding box in the given image. That is, if you call update() with subsequent video frames then it will try to keep track of the position of the object inside bounding_box.
#get_position() == bounding_box
start_track(self: dlib.correlation_tracker, image: array, bounding_box: dlib.rectangle) -> None
- requires
image is a numpy ndarray containing either an 8bit grayscale or RGB image.
bounding_box.is_empty() == false
- ensures
This object will start tracking the thing inside the bounding box in the given image. That is, if you call update() with subsequent video frames then it will try to keep track of the position of the object inside bounding_box.
#get_position() == bounding_box
- update(*args, **kwargs)¶
Overloaded function.
update(self: dlib.correlation_tracker, image: array) -> float
- requires
image is a numpy ndarray containing either an 8bit grayscale or RGB image.
get_position().is_empty() == false (i.e. you must have started tracking by calling start_track())
- ensures
performs: return update(image, get_position())
update(self: dlib.correlation_tracker, image: array, guess: dlib.drectangle) -> float
- requires
image is a numpy ndarray containing either an 8bit grayscale or RGB image.
get_position().is_empty() == false (i.e. you must have started tracking by calling start_track())
- ensures
When searching for the object in image, we search in the area around the provided guess.
#get_position() == the new predicted location of the object in image. This location will be a copy of guess that has been translated and scaled appropriately based on the content of image so that it, hopefully, bounds the object in image.
Returns the peak to side-lobe ratio. This is a number that measures how confident the tracker is that the object is inside #get_position(). Larger values indicate higher confidence.
update(self: dlib.correlation_tracker, image: array, guess: dlib.rectangle) -> float
- requires
image is a numpy ndarray containing either an 8bit grayscale or RGB image.
get_position().is_empty() == false (i.e. you must have started tracking by calling start_track())
- ensures
When searching for the object in image, we search in the area around the provided guess.
#get_position() == the new predicted location of the object in image. This location will be a copy of guess that has been translated and scaled appropriately based on the content of image so that it, hopefully, bounds the object in image.
Returns the peak to side-lobe ratio. This is a number that measures how confident the tracker is that the object is inside #get_position(). Larger values indicate higher confidence.
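The usual call order is start_track() on the first frame, then update() and get_position() on each later frame. The sketch below wraps that loop in a hypothetical helper that accepts any object with those three methods, so the tracker is passed in rather than constructed inside.

```python
def track_object(tracker, frames, bounding_box):
    """Follow one object through frames with a correlation_tracker-like object.

    Returns a list of (position, confidence) pairs, one per frame. The first
    frame has no confidence score, so None is stored for it.
    """
    frames = iter(frames)
    tracker.start_track(next(frames), bounding_box)
    results = [(tracker.get_position(), None)]
    for frame in frames:
        # update() returns the peak to side-lobe ratio for this frame.
        confidence = tracker.update(frame)
        results.append((tracker.get_position(), confidence))
    return results
```

With dlib installed, this could be called as track_object(dlib.correlation_tracker(), frames, dlib.rectangle(left, top, right, bottom)), where frames are numpy arrays holding 8bit grayscale or RGB images. A falling confidence value is a hint that the tracker may have lost the object.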
- dlib.count_points_between_lines(*args, **kwargs)¶
Overloaded function.
count_points_between_lines(l1: dlib.line, l2: dlib.line, reference_point: dlib.dpoint, pts: dlib.points) -> float
count_points_between_lines(l1: dlib.line, l2: dlib.line, reference_point: dlib.dpoint, pts: dlib.dpoints) -> float
- ensures
Counts and returns the number of points in pts that are between lines l1 and l2. Since a pair of lines will, in the general case, divide the plane into 4 regions, we identify the region of interest as the one that contains the reference_point. Therefore, this function counts the number of points in pts that appear in the same region as reference_point.
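The region test described above reduces to two side-of-line checks per point. The following plain-Python sketch represents a line as a pair of points and a point as an (x, y) tuple. Treating points that lie exactly on a line as outside the region is an assumption, and the function names are hypothetical.

```python
def side(line, p):
    """Return +1, -1 or 0 for which side of line the point p lies on."""
    (x1, y1), (x2, y2) = line
    cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
    return (cross > 0) - (cross < 0)

def count_between(l1, l2, reference_point, pts):
    """Count points in the same region (of the four) as reference_point."""
    ref = (side(l1, reference_point), side(l2, reference_point))
    return sum(1 for p in pts if (side(l1, p), side(l2, p)) == ref)

x_axis = ((0, 0), (1, 0))
y_axis = ((0, 0), (0, 1))
pts = [(2, 3), (-1, 2), (4, 0.5), (3, -1)]
n = count_between(x_axis, y_axis, reference_point=(1, 1), pts=pts)
```

With the two axes as lines and a reference point in the first quadrant, only (2, 3) and (4, 0.5) fall in the same region.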
- dlib.count_points_on_side_of_line(*args, **kwargs)¶
Overloaded function.
count_points_on_side_of_line(l: dlib.line, reference_point: dlib.dpoint, pts: dlib.points, dist_thresh_min: float=0, dist_thresh_max: float=inf) -> int
count_points_on_side_of_line(l: dlib.line, reference_point: dlib.dpoint, pts: dlib.dpoints, dist_thresh_min: float=0, dist_thresh_max: float=inf) -> int
- ensures
Returns a count of how many points in pts have a distance from the line l that is in the range [dist_thresh_min, dist_thresh_max]. This distance is a signed value that indicates how far a point is from the line. Moreover, if the point is on the same side as reference_point then the distance is positive, otherwise it is negative. For example, if this range is [0, infinity] then this function counts how many points are on the same side of l as reference_point.
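The signed distance and the range test can be sketched in plain Python as below. Lines are written as point pairs and points as (x, y) tuples; these representations, the function names, and the choice to treat a reference point lying exactly on the line as the positive side are assumptions of the example.

```python
import math

def signed_distance(line, reference_point, p):
    """Distance from p to line, positive on reference_point's side."""
    (x1, y1), (x2, y2) = line
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)

    def raw(q):
        # Perpendicular distance with an arbitrary but consistent sign.
        return (dx * (q[1] - y1) - dy * (q[0] - x1)) / length

    d = raw(p)
    return d if raw(reference_point) >= 0 else -d

def count_on_side(line, reference_point, pts, dist_min=0.0, dist_max=math.inf):
    """Count points whose signed distance lies in [dist_min, dist_max]."""
    return sum(
        1 for p in pts
        if dist_min <= signed_distance(line, reference_point, p) <= dist_max
    )

x_axis = ((0, 0), (1, 0))
pts = [(0, 2), (5, 0.5), (1, -3)]
above = count_on_side(x_axis, (0, 1), pts)
```

Raising dist_min above zero restricts the count to points that are at least that far from the line on the reference side.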
- dlib.count_steps_without_decrease(time_series: object, probability_of_decrease: float = 0.51) int ¶
- requires
time_series must be a one dimensional array of real numbers.
0.5 < probability_of_decrease < 1
- ensures
If you think of the contents of time_series as a potentially noisy time series, then this function returns a count of how long the time series has gone without noticeably decreasing in value. It does this by scanning along the elements, starting from the end (i.e. time_series[-1]) to the beginning, and checking how many elements you need to examine before you are confident that the series has been decreasing in value. Here, “confident of decrease” means the probability of decrease is >= probability_of_decrease.
Setting probability_of_decrease to 0.51 means we count until we see even a small hint of decrease, whereas a larger value of 0.99 would return a larger count since it keeps going until it is nearly certain the time series is decreasing.
The max possible output from this function is len(time_series).
The implementation of this function is done using the dlib::running_gradient object, which is a tool that finds the least squares fit of a line to the time series and the confidence interval around the slope of that line. That can then be used in a simple statistical test to determine if the slope is positive or negative.
- dlib.count_steps_without_decrease_robust(time_series: object, probability_of_decrease: float = 0.51, quantile_discard: float = 0.1) int ¶
- requires
time_series must be a one dimensional array of real numbers.
0.5 < probability_of_decrease < 1
0 <= quantile_discard <= 1
- ensures
This function behaves just like count_steps_without_decrease(time_series,probability_of_decrease) except that it ignores values in the time series that are in the upper quantile_discard quantile. So for example, if the quantile discard is 0.1 then the 10% largest values in the time series are ignored.
- dlib.cross_validate_ranking_trainer(*args, **kwargs)¶
Overloaded function.
cross_validate_ranking_trainer(trainer: dlib.svm_rank_trainer, samples: dlib.ranking_pairs, folds: int) -> ranking_test
cross_validate_ranking_trainer(trainer: dlib.svm_rank_trainer_sparse, samples: dlib.sparse_ranking_pairs, folds: int) -> ranking_test
- dlib.cross_validate_sequence_segmenter(*args, **kwargs)¶
Overloaded function.
cross_validate_sequence_segmenter(samples: dlib.vectorss, segments: dlib.rangess, folds: int, params: dlib.segmenter_params=<BIO,highFeats,signed,win=5,threads=4,eps=0.1,cache=40,non-verbose,C=100>) -> dlib.segmenter_test
cross_validate_sequence_segmenter(samples: dlib.sparse_vectorss, segments: dlib.rangess, folds: int, params: dlib.segmenter_params=<BIO,highFeats,signed,win=5,threads=4,eps=0.1,cache=40,non-verbose,C=100>) -> dlib.segmenter_test
- dlib.cross_validate_trainer(*args, **kwargs)¶
Overloaded function.
cross_validate_trainer(trainer: dlib.svm_c_trainer_radial_basis, x: dlib.vectors, y: dlib.array, folds: int) -> dlib._binary_test
cross_validate_trainer(trainer: dlib.svm_c_trainer_sparse_radial_basis, x: dlib.sparse_vectors, y: dlib.array, folds: int) -> dlib._binary_test
cross_validate_trainer(trainer: dlib.svm_c_trainer_histogram_intersection, x: dlib.vectors, y: dlib.array, folds: int) -> dlib._binary_test
cross_validate_trainer(trainer: dlib.svm_c_trainer_sparse_histogram_intersection, x: dlib.sparse_vectors, y: dlib.array, folds: int) -> dlib._binary_test
cross_validate_trainer(trainer: dlib.svm_c_trainer_linear, x: dlib.vectors, y: dlib.array, folds: int) -> dlib._binary_test
cross_validate_trainer(trainer: dlib.svm_c_trainer_sparse_linear, x: dlib.sparse_vectors, y: dlib.array, folds: int) -> dlib._binary_test
cross_validate_trainer(trainer: dlib.rvm_trainer_radial_basis, x: dlib.vectors, y: dlib.array, folds: int) -> dlib._binary_test
cross_validate_trainer(trainer: dlib.rvm_trainer_sparse_radial_basis, x: dlib.sparse_vectors, y: dlib.array, folds: int) -> dlib._binary_test
cross_validate_trainer(trainer: dlib.rvm_trainer_histogram_intersection, x: dlib.vectors, y: dlib.array, folds: int) -> dlib._binary_test
cross_validate_trainer(trainer: dlib.rvm_trainer_sparse_histogram_intersection, x: dlib.sparse_vectors, y: dlib.array, folds: int) -> dlib._binary_test
cross_validate_trainer(trainer: dlib.rvm_trainer_linear, x: dlib.vectors, y: dlib.array, folds: int) -> dlib._binary_test
cross_validate_trainer(trainer: dlib.rvm_trainer_sparse_linear, x: dlib.sparse_vectors, y: dlib.array, folds: int) -> dlib._binary_test
- dlib.cross_validate_trainer_threaded(*args, **kwargs)¶
Overloaded function.
cross_validate_trainer_threaded(trainer: dlib.svm_c_trainer_radial_basis, x: dlib.vectors, y: dlib.array, folds: int, num_threads: int) -> dlib._binary_test
cross_validate_trainer_threaded(trainer: dlib.svm_c_trainer_sparse_radial_basis, x: dlib.sparse_vectors, y: dlib.array, folds: int, num_threads: int) -> dlib._binary_test
cross_validate_trainer_threaded(trainer: dlib.svm_c_trainer_histogram_intersection, x: dlib.vectors, y: dlib.array, folds: int, num_threads: int) -> dlib._binary_test
cross_validate_trainer_threaded(trainer: dlib.svm_c_trainer_sparse_histogram_intersection, x: dlib.sparse_vectors, y: dlib.array, folds: int, num_threads: int) -> dlib._binary_test
cross_validate_trainer_threaded(trainer: dlib.svm_c_trainer_linear, x: dlib.vectors, y: dlib.array, folds: int, num_threads: int) -> dlib._binary_test
cross_validate_trainer_threaded(trainer: dlib.svm_c_trainer_sparse_linear, x: dlib.sparse_vectors, y: dlib.array, folds: int, num_threads: int) -> dlib._binary_test
cross_validate_trainer_threaded(trainer: dlib.rvm_trainer_radial_basis, x: dlib.vectors, y: dlib.array, folds: int, num_threads: int) -> dlib._binary_test
cross_validate_trainer_threaded(trainer: dlib.rvm_trainer_sparse_radial_basis, x: dlib.sparse_vectors, y: dlib.array, folds: int, num_threads: int) -> dlib._binary_test
cross_validate_trainer_threaded(trainer: dlib.rvm_trainer_histogram_intersection, x: dlib.vectors, y: dlib.array, folds: int, num_threads: int) -> dlib._binary_test
cross_validate_trainer_threaded(trainer: dlib.rvm_trainer_sparse_histogram_intersection, x: dlib.sparse_vectors, y: dlib.array, folds: int, num_threads: int) -> dlib._binary_test
cross_validate_trainer_threaded(trainer: dlib.rvm_trainer_linear, x: dlib.vectors, y: dlib.array, folds: int, num_threads: int) -> dlib._binary_test
cross_validate_trainer_threaded(trainer: dlib.rvm_trainer_sparse_linear, x: dlib.sparse_vectors, y: dlib.array, folds: int, num_threads: int) -> dlib._binary_test
- dlib.distance_to_line(*args, **kwargs)¶
Overloaded function.
distance_to_line(l: dlib.line, p: dlib.point) -> float
distance_to_line(l: dlib.line, p: dlib.dpoint) -> float
returns abs(signed_distance_to_line(l,p))
- dlib.dot(*args, **kwargs)¶
Overloaded function.
dot(arg0: dlib.vector, arg1: dlib.vector) -> float
Compute the dot product between two dense column vectors.
dot(a: dlib.point, b: dlib.point) -> int
Returns the dot product of the points a and b.
dot(a: dlib.dpoint, b: dlib.dpoint) -> float
Returns the dot product of the points a and b.
- class dlib.dpoint¶
This object represents a single point of floating point coordinates that maps directly to a dlib::dpoint.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.dpoint, x: float, y: float) -> None
__init__(self: dlib.dpoint, p: dlib.point) -> None
__init__(self: dlib.dpoint, v: numpy.ndarray[int64]) -> None
__init__(self: dlib.dpoint, v: numpy.ndarray[float32]) -> None
__init__(self: dlib.dpoint, v: numpy.ndarray[float64]) -> None
- normalize(self: dlib.dpoint) dlib.dpoint ¶
Returns a unit normalized copy of this vector.
- property x¶
The x-coordinate of the dpoint.
- property y¶
The y-coordinate of the dpoint.
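As a rough illustration of what dot() and normalize() compute for 2D points, here is a pure-Python sketch. It uses plain (x, y) tuples instead of dlib.dpoint objects, and the helper names dot2 and normalize2 are hypothetical, not part of dlib.

```python
import math

def dot2(a, b):
    """Dot product of two 2D points given as (x, y) tuples."""
    return a[0] * b[0] + a[1] * b[1]

def normalize2(p):
    """Return a unit-length copy of p. Assumes p is not the zero vector."""
    length = math.hypot(p[0], p[1])
    return (p[0] / length, p[1] / length)

u = normalize2((3.0, 4.0))
print(u)           # (0.6, 0.8)
print(dot2(u, u))  # approximately 1.0, since u has unit length
```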
- class dlib.dpoints¶
An array of dpoint objects.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.dpoints) -> None
__init__(self: dlib.dpoints, arg0: dlib.dpoints) -> None
Copy constructor
__init__(self: dlib.dpoints, arg0: iterable) -> None
__init__(self: dlib.dpoints, initial_size: int) -> None
- append(self: dlib.dpoints, x: dlib.dpoint) None ¶
Add an item to the end of the list
- clear(self: dlib.dpoints) None ¶
- count(self: dlib.dpoints, x: dlib.dpoint) int ¶
Return the number of times x appears in the list
- extend(*args, **kwargs)¶
Overloaded function.
extend(self: dlib.dpoints, L: dlib.dpoints) -> None
Extend the list by appending all the items in the given list
extend(self: dlib.dpoints, arg0: list) -> None
- insert(self: dlib.dpoints, i: int, x: dlib.dpoint) None ¶
Insert an item at a given position.
- pop(*args, **kwargs)¶
Overloaded function.
pop(self: dlib.dpoints) -> dlib.dpoint
Remove and return the last item
pop(self: dlib.dpoints, i: int) -> dlib.dpoint
Remove and return the item at index i
- remove(self: dlib.dpoints, x: dlib.dpoint) None ¶
Remove the first item from the list whose value is x. It is an error if there is no such item.
- resize(self: dlib.dpoints, arg0: int) None ¶
- class dlib.drectangle¶
This object represents a rectangular area of an image with floating point coordinates.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.drectangle, left: float, top: float, right: float, bottom: float) -> None
__init__(self: dlib.drectangle, rect: dlib.rectangle) -> None
__init__(self: dlib.drectangle, rect: dlib.drectangle) -> None
__init__(self: dlib.drectangle) -> None
- area(self: dlib.drectangle) float ¶
- bl_corner(self: dlib.drectangle) dlib.dpoint ¶
Returns the bottom left corner of the rectangle.
- bottom(self: dlib.drectangle) float ¶
- br_corner(self: dlib.drectangle) dlib.dpoint ¶
Returns the bottom right corner of the rectangle.
- center(self: dlib.drectangle) dlib.point ¶
- contains(*args, **kwargs)¶
Overloaded function.
contains(self: dlib.drectangle, point: dlib.point) -> bool
contains(self: dlib.drectangle, point: dlib.dpoint) -> bool
contains(self: dlib.drectangle, x: int, y: int) -> bool
contains(self: dlib.drectangle, rectangle: dlib.drectangle) -> bool
- dcenter(self: dlib.drectangle) dlib.point ¶
- height(self: dlib.drectangle) float ¶
- intersect(self: dlib.drectangle, rectangle: dlib.drectangle) dlib.drectangle ¶
- is_empty(self: dlib.drectangle) bool ¶
- left(self: dlib.drectangle) float ¶
- right(self: dlib.drectangle) float ¶
- tl_corner(self: dlib.drectangle) dlib.dpoint ¶
Returns the top left corner of the rectangle.
- top(self: dlib.drectangle) float ¶
- tr_corner(self: dlib.drectangle) dlib.dpoint ¶
Returns the top right corner of the rectangle.
- width(self: dlib.drectangle) float ¶
- dlib.equalize_histogram(*args, **kwargs)¶
Overloaded function.
equalize_histogram(img: numpy.ndarray[(rows,cols),uint8]) -> numpy.ndarray[(rows,cols),uint8]
equalize_histogram(img: numpy.ndarray[(rows,cols),uint16]) -> numpy.ndarray[(rows,cols),uint16]
Returns a histogram equalized version of img.
- dlib.extract_image_4points(*args, **kwargs)¶
Overloaded function.
extract_image_4points(img: numpy.ndarray[(rows,cols),uint8], corners: list, rows: int, columns: int) -> numpy.ndarray[(rows,cols),uint8]
extract_image_4points(img: numpy.ndarray[(rows,cols),uint16], corners: list, rows: int, columns: int) -> numpy.ndarray[(rows,cols),uint16]
extract_image_4points(img: numpy.ndarray[(rows,cols),uint32], corners: list, rows: int, columns: int) -> numpy.ndarray[(rows,cols),uint32]
extract_image_4points(img: numpy.ndarray[(rows,cols),uint64], corners: list, rows: int, columns: int) -> numpy.ndarray[(rows,cols),uint64]
extract_image_4points(img: numpy.ndarray[(rows,cols),int8], corners: list, rows: int, columns: int) -> numpy.ndarray[(rows,cols),int8]
extract_image_4points(img: numpy.ndarray[(rows,cols),int16], corners: list, rows: int, columns: int) -> numpy.ndarray[(rows,cols),int16]
extract_image_4points(img: numpy.ndarray[(rows,cols),int32], corners: list, rows: int, columns: int) -> numpy.ndarray[(rows,cols),int32]
extract_image_4points(img: numpy.ndarray[(rows,cols),int64], corners: list, rows: int, columns: int) -> numpy.ndarray[(rows,cols),int64]
extract_image_4points(img: numpy.ndarray[(rows,cols),float32], corners: list, rows: int, columns: int) -> numpy.ndarray[(rows,cols),float32]
extract_image_4points(img: numpy.ndarray[(rows,cols),float64], corners: list, rows: int, columns: int) -> numpy.ndarray[(rows,cols),float64]
extract_image_4points(img: numpy.ndarray[(rows,cols,3),uint8], corners: list, rows: int, columns: int) -> numpy.ndarray[(rows,cols,3),uint8]
- requires
corners is a list of dpoint or line objects.
len(corners) == 4
rows >= 0
columns >= 0
- ensures
The returned image has the given number of rows and columns.
- if (corners contains dpoints) then
The 4 points in corners define a convex quadrilateral and this function extracts that part of the input image img and returns it. Therefore, each corner of the quadrilateral is associated to a corner of the extracted image and bilinear interpolation and a projective mapping is used to transform the pixels in the quadrilateral into the output image. To determine which corners of the quadrilateral map to which corners of the returned image we fit the tightest possible rectangle to the quadrilateral and map its vertices to their nearest rectangle corners. These corners are then trivially mapped to the output image (i.e. upper left corner to upper left corner, upper right corner to upper right corner, etc.).
- else
This routine finds the 4 intersecting points of the given lines which form a convex quadrilateral and uses them as described above to extract an image. That is, it simply calls: extract_image_4points(img, intersections_between_lines, rows, columns).
If no convex quadrilateral can be made from the given lines then this routine throws no_convex_quadrilateral.
- dlib.extract_image_chip(*args, **kwargs)¶
Overloaded function.
extract_image_chip(img: numpy.ndarray[(rows,cols),uint8], chip_location: dlib.chip_details) -> numpy.ndarray[(rows,cols),uint8]
extract_image_chip(img: numpy.ndarray[(rows,cols),uint16], chip_location: dlib.chip_details) -> numpy.ndarray[(rows,cols),uint16]
extract_image_chip(img: numpy.ndarray[(rows,cols),uint32], chip_location: dlib.chip_details) -> numpy.ndarray[(rows,cols),uint32]
extract_image_chip(img: numpy.ndarray[(rows,cols),uint64], chip_location: dlib.chip_details) -> numpy.ndarray[(rows,cols),uint64]
extract_image_chip(img: numpy.ndarray[(rows,cols),int8], chip_location: dlib.chip_details) -> numpy.ndarray[(rows,cols),int8]
extract_image_chip(img: numpy.ndarray[(rows,cols),int16], chip_location: dlib.chip_details) -> numpy.ndarray[(rows,cols),int16]
extract_image_chip(img: numpy.ndarray[(rows,cols),int32], chip_location: dlib.chip_details) -> numpy.ndarray[(rows,cols),int32]
extract_image_chip(img: numpy.ndarray[(rows,cols),int64], chip_location: dlib.chip_details) -> numpy.ndarray[(rows,cols),int64]
extract_image_chip(img: numpy.ndarray[(rows,cols),float32], chip_location: dlib.chip_details) -> numpy.ndarray[(rows,cols),float32]
extract_image_chip(img: numpy.ndarray[(rows,cols),float64], chip_location: dlib.chip_details) -> numpy.ndarray[(rows,cols),float64]
extract_image_chip(img: numpy.ndarray[(rows,cols,3),uint8], chip_location: dlib.chip_details) -> numpy.ndarray[(rows,cols,3),uint8]
This routine is just like extract_image_chips() except it takes a single chip_details object and returns a single chip image rather than a list of images.
- dlib.extract_image_chips(*args, **kwargs)¶
Overloaded function.
extract_image_chips(img: numpy.ndarray[(rows,cols),uint8], chip_locations: list) -> list
extract_image_chips(img: numpy.ndarray[(rows,cols),uint16], chip_locations: list) -> list
extract_image_chips(img: numpy.ndarray[(rows,cols),uint32], chip_locations: list) -> list
extract_image_chips(img: numpy.ndarray[(rows,cols),uint64], chip_locations: list) -> list
extract_image_chips(img: numpy.ndarray[(rows,cols),int8], chip_locations: list) -> list
extract_image_chips(img: numpy.ndarray[(rows,cols),int16], chip_locations: list) -> list
extract_image_chips(img: numpy.ndarray[(rows,cols),int32], chip_locations: list) -> list
extract_image_chips(img: numpy.ndarray[(rows,cols),int64], chip_locations: list) -> list
extract_image_chips(img: numpy.ndarray[(rows,cols),float32], chip_locations: list) -> list
extract_image_chips(img: numpy.ndarray[(rows,cols),float64], chip_locations: list) -> list
extract_image_chips(img: numpy.ndarray[(rows,cols,3),uint8], chip_locations: list) -> list
- requires
- for all valid i:
chip_locations[i].rect.is_empty() == false
chip_locations[i].rows*chip_locations[i].cols != 0
- ensures
This function extracts “chips” from an image. That is, it takes a list of rectangular sub-windows (i.e. chips) within an image and extracts those sub-windows, storing each into its own image. It also scales and rotates the image chips according to the instructions inside each chip_details object. It uses bilinear interpolation.
The extracted image chips are returned in a python list of numpy arrays. The length of the returned array is len(chip_locations).
- Let CHIPS be the returned array, then we have:
- for all valid i:
#CHIPS[i] == The image chip extracted from the position chip_locations[i].rect in img.
#CHIPS[i].shape[0] == chip_locations[i].rows
#CHIPS[i].shape[1] == chip_locations[i].cols
The image will have been rotated counter-clockwise by chip_locations[i].angle radians, around the center of chip_locations[i].rect, before the chip was extracted.
Any pixels in an image chip that go outside img are set to 0 (i.e. black).
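The bilinear interpolation and the black fill for out-of-image pixels can be pictured with the following pure-Python sketch. It is not dlib's implementation: sample_bilinear is a hypothetical helper, images are plain lists of rows, and treating each missing neighbor as 0 (rather than zeroing the whole sample) is an assumption made for illustration.

```python
import math

def sample_bilinear(img, r, c):
    """Bilinearly interpolate img (a list of rows) at fractional (r, c).

    Neighbors that fall outside the image contribute 0, in the spirit of
    the rule that pixels outside img are black (assumption of this sketch).
    """
    def px(rr, cc):
        if 0 <= rr < len(img) and 0 <= cc < len(img[0]):
            return img[rr][cc]
        return 0.0

    r0, c0 = math.floor(r), math.floor(c)
    fr, fc = r - r0, c - c0
    # Interpolate along columns on the two surrounding rows, then between rows.
    top = (1 - fc) * px(r0, c0) + fc * px(r0, c0 + 1)
    bottom = (1 - fc) * px(r0 + 1, c0) + fc * px(r0 + 1, c0 + 1)
    return (1 - fr) * top + fr * bottom

img = [[0.0, 10.0],
       [20.0, 30.0]]
center = sample_bilinear(img, 0.5, 0.5)     # midpoint of the four pixels
outside = sample_bilinear(img, -5.0, -5.0)  # far outside the image
print(center, outside)
```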
- class dlib.face_recognition_model_v1¶
This object maps human faces into 128D vectors where pictures of the same person are mapped near to each other and pictures of different people are mapped far apart. The constructor loads the face recognition model from a file. The model file is available here: http://dlib.net/files/dlib_face_recognition_resnet_model_v1.dat.bz2
- __init__(self: dlib.face_recognition_model_v1, arg0: str) None ¶
- compute_face_descriptor(*args, **kwargs)¶
Overloaded function.
compute_face_descriptor(self: dlib.face_recognition_model_v1, img: numpy.ndarray[(rows,cols,3),uint8], face: dlib.full_object_detection, num_jitters: int=0, padding: float=0.25) -> dlib.vector
Takes an image and a full_object_detection that references a face in that image and converts it into a 128D face descriptor. If num_jitters>1 then each face will be randomly jittered slightly num_jitters times, each run through the 128D projection, and the average used as the face descriptor. The optional padding argument overrides the default padding of 0.25 around the face.
compute_face_descriptor(self: dlib.face_recognition_model_v1, img: numpy.ndarray[(rows,cols,3),uint8], num_jitters: int=0) -> dlib.vector
Takes an aligned face image of size 150x150 and converts it into a 128D face descriptor. Note that the alignment should be done in the same way dlib.get_face_chip does it. If num_jitters>1 then the image will be randomly jittered slightly num_jitters times, each run through the 128D projection, and the average used as the face descriptor.
compute_face_descriptor(self: dlib.face_recognition_model_v1, img: numpy.ndarray[(rows,cols,3),uint8], faces: dlib.full_object_detections, num_jitters: int=0, padding: float=0.25) -> dlib.vectors
Takes an image and an array of full_object_detections that reference faces in that image and converts them into 128D face descriptors. If num_jitters>1 then each face will be randomly jittered slightly num_jitters times, each run through the 128D projection, and the average used as the face descriptor. The optional padding argument overrides the default padding of 0.25 around the face.
compute_face_descriptor(self: dlib.face_recognition_model_v1, batch_img: List[numpy.ndarray[(rows,cols,3),uint8]], batch_faces: List[dlib.full_object_detections], num_jitters: int=0, padding: float=0.25) -> dlib.vectorss
Takes an array of images and an array of arrays of full_object_detections. batch_faces[i] must be an array of full_object_detections corresponding to the image batch_img[i], referencing faces in that image. Every face will be converted into a 128D face descriptor. If num_jitters>1 then each face will be randomly jittered slightly num_jitters times, each run through the 128D projection, and the average used as the face descriptor. The optional padding argument overrides the default padding of 0.25 around the face.
compute_face_descriptor(self: dlib.face_recognition_model_v1, batch_img: List[numpy.ndarray[(rows,cols,3),uint8]], num_jitters: int=0) -> dlib.vectors
Takes an array of aligned images of faces of size 150x150. Note that the alignment should be done in the same way dlib.get_face_chip does it. Every face will be converted into a 128D face descriptor. If num_jitters>1 then each face will be randomly jittered slightly num_jitters times, each run through the 128D projection, and the average used as the face descriptor.
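Since descriptors of the same person are mapped near each other, a common way to use them is to compare Euclidean distances. The sketch below works on plain lists of floats (short toy vectors stand in for real 128D descriptors). The helper names are hypothetical, and the 0.6 threshold is an assumption taken from common practice in dlib's face recognition examples, not something stated in this reference. average_descriptors only illustrates the averaging step described for num_jitters.

```python
import math

def descriptor_distance(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_descriptors(descriptors):
    """Element-wise mean of several descriptors (as done for jittered copies)."""
    n = len(descriptors)
    return [sum(vals) / n for vals in zip(*descriptors)]

def same_person(a, b, threshold=0.6):
    """True if the descriptors are closer than threshold (assumed value)."""
    return descriptor_distance(a, b) < threshold

# Toy 3D vectors standing in for 128D descriptors.
anchor = average_descriptors([[0.0, 0.0, 0.0], [0.2, 0.0, 0.0]])
near = [0.1, 0.3, 0.0]
far = [1.0, 1.0, 1.0]
print(same_person(anchor, near), same_person(anchor, far))
```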
- class dlib.fhog_object_detector¶
This object represents a sliding window histogram-of-oriented-gradients based object detector.
- __call__(self: dlib.fhog_object_detector, image: array, upsample_num_times: int = 0) dlib.rectangles ¶
- requires
image is a numpy ndarray containing either an 8bit grayscale or RGB image.
upsample_num_times >= 0
- ensures
This function runs the object detector on the input image and returns a list of detections.
Upsamples the image upsample_num_times before running the basic detector.
- __init__(self: dlib.fhog_object_detector, arg0: str) None ¶
Loads an object detector from a file that contains the output of the train_simple_object_detector() routine or a serialized C++ object of type object_detector<scan_fhog_pyramid<pyramid_down<6>>>.
- property detection_window_height¶
- property detection_window_width¶
- property num_detectors¶
- run(self: dlib.fhog_object_detector, image: array, upsample_num_times: int = 0, adjust_threshold: float = 0.0) tuple ¶
- requires
image is a numpy ndarray containing either an 8bit grayscale or RGB image.
upsample_num_times >= 0
- ensures
This function runs the object detector on the input image and returns a tuple of (list of detections, list of scores, list of weight_indices).
Upsamples the image upsample_num_times before running the basic detector.
- run_multiple(detectors: list, image: array, upsample_num_times: int = 0, adjust_threshold: float = 0.0) tuple ¶
- requires
detectors is a list of detectors.
image is a numpy ndarray containing either an 8bit grayscale or RGB image.
upsample_num_times >= 0
- ensures
This function runs the list of object detectors at once on the input image and returns a tuple of (list of detections, list of scores, list of weight_indices).
Upsamples the image upsample_num_times before running the basic detector.
- save(self: dlib.fhog_object_detector, detector_output_filename: str) None ¶
Save a simple_object_detector to the provided path.
- dlib.find_bright_keypoints(xx: numpy.ndarray[rows, cols, float32], xy: numpy.ndarray[rows, cols, float32], yy: numpy.ndarray[rows, cols, float32]) numpy.ndarray[rows, cols, float32] ¶
- requires
xx, xy, and yy all have the same dimensions.
- ensures
This routine finds bright “keypoints” in an image. In general, these are bright/white localized blobs. It does this by computing the determinant of the image Hessian at each location and storing this value into the returned image if both eigenvalues of the Hessian are negative. If either eigenvalue is positive then the output value for that pixel is 0. I.e.
Let OUT denote the returned image.
- for all valid r,c:
OUT[r][c] == a number >= 0 and larger values indicate the presence of a keypoint at this pixel location.
We assume that xx, xy, and yy are the 3 second order gradients of the image in question. You can obtain these gradients using the image_gradients class.
The output image will have the same dimensions as the input images.
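The per-pixel rule described above can be sketched in pure Python. This is an illustration of the stated rule only, not dlib's code. The helper name bright_keypoint_response is hypothetical, and the images are plain nested lists rather than numpy arrays. It relies on the fact that a symmetric 2x2 matrix has two negative eigenvalues exactly when its determinant is positive and its trace is negative.

```python
def bright_keypoint_response(xx, xy, yy):
    """Response at one pixel for the Hessian [[xx, xy], [xy, yy]]."""
    det = xx * yy - xy * xy
    trace = xx + yy
    # Both eigenvalues negative <=> det > 0 and trace < 0.
    if det > 0 and trace < 0:
        return det
    return 0.0

# Three second order gradient images with identical dimensions.
xx = [[-2.0, 2.0], [-2.0, 0.0]]
xy = [[0.0, 0.0], [0.0, 0.0]]
yy = [[-3.0, 3.0], [3.0, 0.0]]

out = [[bright_keypoint_response(xx[r][c], xy[r][c], yy[r][c])
        for c in range(len(xx[0]))]
       for r in range(len(xx))]
print(out)  # only the top-left pixel has two negative eigenvalues
```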
- dlib.find_bright_lines(xx: numpy.ndarray[rows, cols, float32], xy: numpy.ndarray[rows, cols, float32], yy: numpy.ndarray[rows, cols, float32]) tuple ¶
- requires
xx, xy, and yy all have the same dimensions.
- ensures
This routine is similar to sobel_edge_detector(), except instead of finding an edge it finds a bright/white line. For example, the border between a black piece of paper and a white table is an edge, but a curve drawn with a pencil on a piece of paper makes a line. Therefore, the output of this routine is a vector field encoded in the horz and vert images, which are returned in a tuple where the first element is horz and the second is vert.
The vector obtains a large magnitude when centered on a bright line in an image and the direction of the vector is perpendicular to the line. To be very precise, each vector points in the direction of greatest change in second derivative and the magnitude of the vector encodes the derivative magnitude in that direction. Moreover, if the second derivative is positive then the output vector is zero. This zeroing of positive gradients causes the output to be sensitive only to bright lines surrounded by darker pixels.
We assume that xx, xy, and yy are the 3 second order gradients of the image in question. You can obtain these gradients using the image_gradients class.
The output images will have the same dimensions as the input images.
- dlib.find_candidate_object_locations(image: array, rects: list, kvals: tuple = (50, 200, 3), min_size: int = 20, max_merging_iterations: int = 50) None ¶
Returns found candidate objects
- requires
image == an image object which is a numpy ndarray
len(kvals) == 3
kvals should be a tuple that specifies the range of k values to use. In particular, it should take the form (start, end, num) where num > 0.
- ensures
This function takes an input image and generates a set of candidate rectangles which are expected to bound any objects in the image. It does this by running a version of the segment_image() routine on the image and then reports rectangles containing each of the segments as well as rectangles containing unions of adjacent segments. The basic idea is described in the paper:
Segmentation as Selective Search for Object Recognition by Koen E. A. van de Sande, et al.
Note that this function deviates from what is described in the paper slightly. See the code for details.
The basic segmentation is performed kvals[2] times, each time with the k parameter (see segment_image() and the Felzenszwalb paper for details on k) set to a different value from the range of numbers linearly spaced between kvals[0] to kvals[1].
When doing the basic segmentations prior to any box merging, we discard all rectangles that have an area < min_size. Therefore, all outputs and subsequent merged rectangles are built out of rectangles that contain at least min_size pixels. Note that setting min_size to a smaller value than you might otherwise be interested in using can be useful since it allows a larger number of possible merged boxes to be created.
There are max_merging_iterations rounds of neighboring blob merging. Therefore, this parameter has some effect on the number of output rectangles you get, with larger values of the parameter giving more output rectangles.
This function appends the output rectangles into #rects. This means that any rectangles in rects before this function was called will still be in there after it terminates. Note further that #rects will not contain any duplicate rectangles. That is, for all valid i and j where i != j it will be true that:
#rects[i] != #rects[j]
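To make the role of kvals concrete, the following sketch lists the k values implied by a (start, end, num) tuple, assuming ordinary linear spacing that includes both endpoints. The helper k_values is hypothetical and only covers num >= 2. How a single value is chosen when num == 1 is not specified here, so the sketch refuses that case.

```python
def k_values(kvals):
    """Linearly spaced k values for kvals == (start, end, num), num >= 2."""
    start, end, num = kvals
    if num < 2:
        raise ValueError("this sketch only covers num >= 2")
    step = (end - start) / (num - 1)
    return [start + i * step for i in range(num)]

default_ks = k_values((50, 200, 3))
print(default_ks)  # [50.0, 125.0, 200.0]
```

With the default kvals of (50, 200, 3), the basic segmentation would therefore run three times, once per listed k.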
- dlib.find_dark_keypoints(xx: numpy.ndarray[rows, cols, float32], xy: numpy.ndarray[rows, cols, float32], yy: numpy.ndarray[rows, cols, float32]) numpy.ndarray[rows, cols, float32] ¶
- requires
xx, xy, and yy all have the same dimensions.
- ensures
This routine finds dark “keypoints” in an image. In general, these are dark localized blobs. It does this by computing the determinant of the image Hessian at each location and storing this value into the returned image if both eigenvalues of the Hessian are positive. If either eigenvalue is negative then the output value for that pixel is 0. I.e.
Let OUT denote the returned image.
- for all valid r,c:
OUT[r][c] == a number >= 0 and larger values indicate the presence of a keypoint at this pixel location.
We assume that xx, xy, and yy are the 3 second order gradients of the image in question. You can obtain these gradients using the image_gradients class.
The output image will have the same dimensions as the input images.
- dlib.find_dark_lines(xx: numpy.ndarray[rows, cols, float32], xy: numpy.ndarray[rows, cols, float32], yy: numpy.ndarray[rows, cols, float32]) tuple ¶
- requires
xx, xy, and yy all have the same dimensions.
- ensures
This routine is similar to sobel_edge_detector(), except instead of finding an edge it finds a dark line. For example, the border between a black piece of paper and a white table is an edge, but a curve drawn with a pencil on a piece of paper makes a line. Therefore, the output of this routine is a vector field encoded in the horz and vert images, which are returned in a tuple where the first element is horz and the second is vert.
The vector obtains a large magnitude when centered on a dark line in an image and the direction of the vector is perpendicular to the line. To be very precise, each vector points in the direction of greatest change in second derivative and the magnitude of the vector encodes the derivative magnitude in that direction. Moreover, if the second derivative is negative then the output vector is zero. This zeroing of negative gradients causes the output to be sensitive only to dark lines surrounded by brighter pixels.
We assume that xx, xy, and yy are the 3 second order gradients of the image in question. You can obtain these gradients using the image_gradients class.
The output images will have the same dimensions as the input images.
- dlib.find_line_endpoints(img: numpy.ndarray[rows, cols, uint8]) dlib.points ¶
- requires
all pixels in img are set to either 255 or 0. (i.e. it must be a binary image)
- ensures
This routine finds endpoints of lines in a thinned binary image. For example, if the image was produced by skeleton() or something like a Canny edge detector then you can use find_line_endpoints() to find the pixels sitting on the ends of lines.
- dlib.find_max_global(*args, **kwargs)¶
Overloaded function.
find_max_global(f: object, bound1: list, bound2: list, is_integer_variable: list, num_function_calls: int, solver_epsilon: float=0) -> tuple
- requires
len(bound1) == len(bound2) == len(is_integer_variable)
for all valid i: bound1[i] != bound2[i]
solver_epsilon >= 0
f() is a real valued multi-variate function. It must take scalar real numbers as its arguments and the number of arguments must be len(bound1).
- ensures
This function performs global optimization on the given f() function. The goal is to maximize the following objective function:
f(x)
- subject to the constraints:
min(bound1[i],bound2[i]) <= x[i] <= max(bound1[i],bound2[i])
if (is_integer_variable[i]) then x[i] is an integer value (but still represented with float type).
find_max_global() runs until it has called f() num_function_calls times. Then it returns the best x it has found along with the corresponding output of f(). That is, it returns (best_x_seen,f(best_x_seen)). Here best_x_seen is a list containing the best arguments to f() this function has found.
find_max_global() uses a global optimization method based on a combination of non-parametric global function modeling and quadratic trust region modeling to efficiently find a global maximizer. It usually does a good job with a relatively small number of calls to f(). For more information on how it works read the documentation for dlib’s global_function_search object. However, one notable element is the solver epsilon, which you can adjust.
The search procedure will only attempt to find a global maximizer to at most solver_epsilon accuracy. Once a local maximizer is found to that accuracy the search will focus entirely on finding other maxima elsewhere rather than on further improving the current local optimum. That is, once a local maximum is identified to about solver_epsilon accuracy, the algorithm will spend all its time exploring the function to find other local maxima to investigate. An epsilon of 0 means it will keep solving until it reaches full floating point precision. Larger values will cause it to switch to pure global exploration sooner and therefore might be more effective if your objective function has many local maxima and you don’t care about a super high precision solution.
- Any variables that satisfy the following conditions are optimized on a log-scale:
The lower bound on the variable is > 0
The ratio of the upper bound to lower bound is > 1000
The variable is not an integer variable
We do this because it’s common to optimize machine learning models that have parameters with bounds in a range such as [1e-5 to 1e10] (e.g. the SVM C parameter) and it’s much more appropriate to optimize these kinds of variables on a log scale. So we transform them by applying log() to them and then undo the transform via exp() before invoking the function being optimized. Therefore, this transformation is invisible to the user supplied functions. In most cases, it improves the efficiency of the optimizer.
find_max_global(f: object, bound1: list, bound2: list, num_function_calls: int, solver_epsilon: float=0) -> tuple
This function simply calls the other version of find_max_global() with is_integer_variable set to False for all variables.
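The log-scale rule described above can be restated as a small pure-Python sketch. It is only a reading of the three listed conditions and of the log()/exp() round trip, not dlib's implementation. The helper names are hypothetical, and taking min/max of the two bounds mirrors the constraint that bound1 and bound2 may be given in either order.

```python
import math

def uses_log_scale(bound1, bound2, is_integer):
    """Apply the three documented conditions to one variable."""
    lower, upper = min(bound1, bound2), max(bound1, bound2)
    return lower > 0 and upper / lower > 1000 and not is_integer

def to_search_space(value, log_scale):
    """Map a variable into the space the optimizer works in."""
    return math.log(value) if log_scale else value

def from_search_space(value, log_scale):
    """Undo the mapping before the user supplied function is invoked."""
    return math.exp(value) if log_scale else value

# An SVM C style parameter bounded by [1e-5, 1e10] qualifies.
c_is_log = uses_log_scale(1e-5, 1e10, False)
round_trip = from_search_space(to_search_space(0.01, c_is_log), c_is_log)
print(c_is_log, round_trip)
```

With dlib installed, a call following the signature above would look like dlib.find_max_global(f, [1e-5], [1e10], 50), where f takes a single float argument. The transformation stays invisible to f.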
- dlib.find_min_global(*args, **kwargs)¶
Overloaded function.
find_min_global(f: object, bound1: list, bound2: list, is_integer_variable: list, num_function_calls: int, solver_epsilon: float=0) -> tuple
This function is just like find_max_global(), except it performs minimization rather than maximization.
find_min_global(f: object, bound1: list, bound2: list, num_function_calls: int, solver_epsilon: float=0) -> tuple
This function simply calls the other version of find_min_global() with is_integer_variable set to False for all variables.
- dlib.find_optimal_momentum_filter(sequence: object, smoothness: float = 1) dlib.momentum_filter ¶
- requires
sequences.size() != 0
for all valid i: sequences[i].size() > 4
smoothness >= 0
- ensures
This function finds the “optimal” settings of a momentum_filter based on recorded measurement data stored in sequences. Here we assume that each vector in sequences is a complete track history of some object’s measured positions. What we do is find the momentum_filter that minimizes the following objective function:
sum of abs(predicted_location[i] - measured_location[i]) + smoothness*abs(filtered_location[i]-filtered_location[i-1])
where i is a time index.
The sum runs over all the data in sequences. So what we do is find the filter settings that produce smooth filtered trajectories but also produce filtered outputs that are as close to the measured positions as possible. The larger the value of smoothness the less jittery the filter outputs will be, but they might become biased or laggy if smoothness is set really high.
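To show how the smoothness term trades off against measurement error, here is a sketch that evaluates the objective for one track of scalar positions. The function filter_objective is hypothetical, the three input lists are assumed to be already computed, and skipping the smoothness term at i == 0 (where there is no previous filtered value) is an assumption of this sketch.

```python
def filter_objective(predicted, measured, filtered, smoothness=1.0):
    """Sum of prediction errors plus smoothness-weighted filter jumps."""
    total = 0.0
    for i in range(len(measured)):
        total += abs(predicted[i] - measured[i])
        if i > 0:  # no previous filtered location exists at i == 0
            total += smoothness * abs(filtered[i] - filtered[i - 1])
    return total

predicted = [1.0, 2.0, 3.0]
measured = [1.0, 2.5, 3.0]
filtered = [1.0, 2.0, 3.5]
cost = filter_objective(predicted, measured, filtered, smoothness=2.0)
print(cost)  # 0.5 of prediction error plus 2.0 * 2.5 of jump penalty
```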
- dlib.find_optimal_rect_filter(rects: std::vector<dlib::rectangle, std::allocator<dlib::rectangle> >, smoothness: float=1) dlib.rect_filter ¶
- requires
rects.size() > 4
smoothness >= 0
- ensures
This function finds the “optimal” settings of a rect_filter based on recorded measurement data stored in rects. Here we assume that rects is a complete track history of some object’s measured positions. Essentially, what we do is find the rect_filter that minimizes the following objective function:
sum of abs(predicted_location[i] - measured_location[i]) + smoothness*abs(filtered_location[i]-filtered_location[i-1])
where i is a time index.
The sum runs over all the data in rects. So what we do is find the filter settings that produce smooth filtered trajectories but also produce filtered outputs that are as close to the measured positions as possible. The larger the value of smoothness the less jittery the filter outputs will be, but they might become biased or laggy if smoothness is set really high.
- dlib.find_peaks(*args, **kwargs)¶
Overloaded function.
find_peaks(img: numpy.ndarray[(rows,cols),float32], non_max_suppression_radius: float, thresh: float) -> dlib.points
find_peaks(img: numpy.ndarray[(rows,cols),float64], non_max_suppression_radius: float, thresh: float) -> dlib.points
find_peaks(img: numpy.ndarray[(rows,cols),uint8], non_max_suppression_radius: float, thresh: int) -> dlib.points
find_peaks(img: numpy.ndarray[(rows,cols),uint16], non_max_suppression_radius: float, thresh: int) -> dlib.points
find_peaks(img: numpy.ndarray[(rows,cols),uint32], non_max_suppression_radius: float, thresh: int) -> dlib.points
find_peaks(img: numpy.ndarray[(rows,cols),uint64], non_max_suppression_radius: float, thresh: int) -> dlib.points
find_peaks(img: numpy.ndarray[(rows,cols),int8], non_max_suppression_radius: float, thresh: int) -> dlib.points
find_peaks(img: numpy.ndarray[(rows,cols),int16], non_max_suppression_radius: float, thresh: int) -> dlib.points
find_peaks(img: numpy.ndarray[(rows,cols),int32], non_max_suppression_radius: float, thresh: int) -> dlib.points
find_peaks(img: numpy.ndarray[(rows,cols),int64], non_max_suppression_radius: float, thresh: int) -> dlib.points
- requires
non_max_suppression_radius >= 0
- ensures
Scans the given image and finds all pixels with values >= thresh that are also local maximums within their 8-connected neighborhood of the image. Such pixels are collected, sorted in decreasing order of their pixel values, and then non-maximum suppression is applied to this list of points using the given non_max_suppression_radius. The final list of peaks is then returned.
- Therefore, the returned list, V, will have these properties:
len(V) == the number of peaks found in the image.
When measured in image coordinates, no elements of V are within non_max_suppression_radius distance of each other. That is, for all valid i!=j it is true that length(V[i]-V[j]) > non_max_suppression_radius.
For each element of V, that element has the maximum pixel value of all pixels in the ball centered on that pixel with radius non_max_suppression_radius.
find_peaks(img: numpy.ndarray[(rows,cols),float32], non_max_suppression_radius: float=0) -> dlib.points
find_peaks(img: numpy.ndarray[(rows,cols),float64], non_max_suppression_radius: float=0) -> dlib.points
find_peaks(img: numpy.ndarray[(rows,cols),uint8], non_max_suppression_radius: float=0) -> dlib.points
find_peaks(img: numpy.ndarray[(rows,cols),uint16], non_max_suppression_radius: float=0) -> dlib.points
find_peaks(img: numpy.ndarray[(rows,cols),uint32], non_max_suppression_radius: float=0) -> dlib.points
find_peaks(img: numpy.ndarray[(rows,cols),uint64], non_max_suppression_radius: float=0) -> dlib.points
find_peaks(img: numpy.ndarray[(rows,cols),int8], non_max_suppression_radius: float=0) -> dlib.points
find_peaks(img: numpy.ndarray[(rows,cols),int16], non_max_suppression_radius: float=0) -> dlib.points
find_peaks(img: numpy.ndarray[(rows,cols),int32], non_max_suppression_radius: float=0) -> dlib.points
find_peaks(img: numpy.ndarray[(rows,cols),int64], non_max_suppression_radius: float=0) -> dlib.points
performs: return find_peaks(img, non_max_suppression_radius, partition_pixels(img))
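The scan, sort, and suppress steps described above can be sketched in plain Python. This is only an illustration of the documented behaviour, not dlib's implementation: the name find_peaks_sketch is hypothetical, the image is a list of rows, ties between equal neighbours are assumed to count as local maxima, and points are returned as (x, y) tuples rather than dlib.points.

```python
import math

def find_peaks_sketch(img, non_max_suppression_radius, thresh):
    """Return (x, y) peaks of img (a list of rows), strongest first."""
    rows, cols = len(img), len(img[0])
    candidates = []
    for y in range(rows):
        for x in range(cols):
            v = img[y][x]
            if v < thresh:
                continue
            # Compare against the 8-connected neighbourhood (clipped at the border).
            neighbours = [
                img[ny][nx]
                for ny in range(max(0, y - 1), min(rows, y + 2))
                for nx in range(max(0, x - 1), min(cols, x + 2))
                if (ny, nx) != (y, x)
            ]
            # Assumption: a pixel equal to its largest neighbour still counts.
            if all(v >= n for n in neighbours):
                candidates.append((v, x, y))
    # Sort by decreasing pixel value, then apply greedy non-maximum suppression.
    candidates.sort(key=lambda c: -c[0])
    peaks = []
    for v, x, y in candidates:
        if all(math.hypot(x - px, y - py) > non_max_suppression_radius
               for px, py in peaks):
            peaks.append((x, y))
    return peaks

img = [
    [0, 0, 0, 0, 0],
    [0, 9, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 7, 0],
    [0, 0, 0, 0, 5],
]
all_peaks = find_peaks_sketch(img, 0, 4)     # the 9 and the 7 survive
sparse_peaks = find_peaks_sketch(img, 3, 4)  # the 7 is within radius 3 of the 9
```

The pixel with value 5 is never a candidate because its neighbour 7 is larger, which matches the local-maximum requirement stated above.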
- dlib.find_projective_transform(*args, **kwargs)¶
Overloaded function.
find_projective_transform(from_points: dlib.dpoints, to_points: dlib.dpoints) -> dlib.point_transform_projective
- requires
len(from_points) == len(to_points)
len(from_points) >= 4
- ensures
- returns a point_transform_projective object, T, such that for all valid i:
length(T(from_points[i]) - to_points[i])
is minimized as often as possible. That is, this function finds the projective transform that maps points in from_points to points in to_points. If no projective transform exists which performs this mapping exactly then the one which minimizes the mean squared error is selected.
find_projective_transform(from_points: numpy.ndarray[(rows,cols),float32], to_points: numpy.ndarray[(rows,cols),float32]) -> dlib.point_transform_projective
- requires
from_points and to_points have two columns and the same number of rows. Moreover, they have at least 4 rows.
- ensures
- returns a point_transform_projective object, T, such that for all valid i:
length(T(dpoint(from_points[i])) - dpoint(to_points[i]))
is minimized as often as possible. That is, this function finds the projective transform that maps points in from_points to points in to_points. If no projective transform exists which performs this mapping exactly then the one which minimizes the mean squared error is selected.
find_projective_transform(from_points: numpy.ndarray[(rows,cols),float64], to_points: numpy.ndarray[(rows,cols),float64]) -> dlib.point_transform_projective
- requires
from_points and to_points have two columns and the same number of rows. Moreover, they have at least 4 rows.
- ensures
- returns a point_transform_projective object, T, such that for all valid i:
length(T(dpoint(from_points[i])) - dpoint(to_points[i]))
is minimized as often as possible. That is, this function finds the projective transform that maps points in from_points to points in to_points. If no projective transform exists which performs this mapping exactly then the one which minimizes the mean squared error is selected.
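To make the mapping concrete, here is a minimal pure-Python sketch that fits a projective transform to exactly four correspondences by solving an 8x8 linear system. It is a simplification under stated assumptions: the names solve_linear and fit_projective are hypothetical, the bottom-right entry of the 3x3 transform is fixed to 1, and the least-squares fit that the real function performs for more than four points is not reproduced.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_projective(from_points, to_points):
    """Return a function T with T(from_points[i]) == to_points[i] (4 points)."""
    a, b = [], []
    for (x, y), (u, v) in zip(from_points, to_points):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)

    def transform(p):
        x, y = p
        w = h[6] * x + h[7] * y + 1.0
        return ((h[0] * x + h[1] * y + h[2]) / w,
                (h[3] * x + h[4] * y + h[5]) / w)

    return transform

src = [(0, 0), (1, 0), (1, 1), (0, 1)]
dst = [(3, 4), (5, 4), (5, 6), (3, 6)]   # scale by 2, then shift by (3, 4)
T = fit_projective(src, dst)
centre = T((0.5, 0.5))                   # approximately (4.0, 5.0)
```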
- class dlib.full_object_detection¶
This object represents the location of an object in an image along with the positions of each of its constituent parts.
- __init__(self: dlib.full_object_detection, rect: dlib.rectangle, parts: object) None ¶
- requires
rect: dlib rectangle
parts: list of dlib.point, or a dlib.points object.
- property num_parts¶
The number of parts of the object.
- part(self: dlib.full_object_detection, idx: int) dlib.point ¶
A single part of the object as a dlib point.
- parts(self: dlib.full_object_detection) dlib.points ¶
A vector of dlib points representing all of the parts.
- property rect¶
Bounding box from the underlying detector. Parts can be outside box if appropriate.
- class dlib.full_object_detections¶
An array of full_object_detection objects.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.full_object_detections) -> None
__init__(self: dlib.full_object_detections, arg0: dlib.full_object_detections) -> None
Copy constructor
__init__(self: dlib.full_object_detections, arg0: iterable) -> None
- append(self: dlib.full_object_detections, x: dlib.full_object_detection) None ¶
Add an item to the end of the list
- clear(self: dlib.full_object_detections) None ¶
- count(self: dlib.full_object_detections, x: dlib.full_object_detection) int ¶
Return the number of times x appears in the list
- extend(*args, **kwargs)¶
Overloaded function.
extend(self: dlib.full_object_detections, L: dlib.full_object_detections) -> None
Extend the list by appending all the items in the given list
extend(self: dlib.full_object_detections, arg0: list) -> None
- insert(self: dlib.full_object_detections, i: int, x: dlib.full_object_detection) None ¶
Insert an item at a given position.
- pop(*args, **kwargs)¶
Overloaded function.
pop(self: dlib.full_object_detections) -> dlib.full_object_detection
Remove and return the last item
pop(self: dlib.full_object_detections, i: int) -> dlib.full_object_detection
Remove and return the item at index i
- remove(self: dlib.full_object_detections, x: dlib.full_object_detection) None ¶
Remove the first item from the list whose value is x. It is an error if there is no such item.
- resize(self: dlib.full_object_detections, arg0: int) None ¶
- class dlib.function_evaluation¶
This object records the output of a real valued function in response to some input.
In particular, if you have a function F(x) then the function_evaluation is simply a struct that records x and the scalar value F(x).
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.function_evaluation, x: dlib.vector, y: float) -> None
__init__(self: dlib.function_evaluation, x: list, y: float) -> None
- property x¶
- property y¶
- class dlib.function_evaluation_request¶
See: http://dlib.net/dlib/global_optimization/global_function_search_abstract.h.html
- __init__(*args, **kwargs)¶
- property function_idx¶
- property has_been_evaluated¶
- set(self: dlib.function_evaluation_request, arg0: float) None ¶
- property x¶
- class dlib.function_spec¶
See: http://dlib.net/dlib/global_optimization/global_function_search_abstract.h.html
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.function_spec, bound1: dlib.vector, bound2: dlib.vector) -> None
__init__(self: dlib.function_spec, bound1: dlib.vector, bound2: dlib.vector, is_integer: List[bool]) -> None
__init__(self: dlib.function_spec, bound1: list, bound2: list) -> None
__init__(self: dlib.function_spec, bound1: list, bound2: list, is_integer: list) -> None
- property is_integer_variable¶
- property lower¶
- property upper¶
- dlib.gaussian_blur(*args, **kwargs)¶
Overloaded function.
gaussian_blur(img: numpy.ndarray[(rows,cols,3),uint8], sigma: float, max_size: int=1000) -> tuple
gaussian_blur(img: numpy.ndarray[(rows,cols),uint8], sigma: float, max_size: int=1000) -> tuple
gaussian_blur(img: numpy.ndarray[(rows,cols),uint16], sigma: float, max_size: int=1000) -> tuple
gaussian_blur(img: numpy.ndarray[(rows,cols),uint32], sigma: float, max_size: int=1000) -> tuple
gaussian_blur(img: numpy.ndarray[(rows,cols),float32], sigma: float, max_size: int=1000) -> tuple
gaussian_blur(img: numpy.ndarray[(rows,cols),float64], sigma: float, max_size: int=1000) -> tuple
- requires
sigma > 0
max_size > 0
max_size is an odd number
- ensures
Filters img with a Gaussian filter of sigma width. The actual spatial filter will be applied to pixel blocks that are at most max_size wide and max_size tall (note that this function will automatically select a smaller block size as appropriate). The results are returned. We also return a rectangle which indicates what pixels in the returned image are considered non-border pixels and therefore contain output from the filter. E.g.
filtered_img, rect = gaussian_blur(img, sigma)
would give you the filtered image and the rectangle in question.
The filter is applied to each color channel independently.
Pixels so close to the edge of img that the filter would not fit entirely inside the image are set to zero.
The returned image has the same dimensions as the input image.
- dlib.get_face_chip(img: numpy.ndarray[rows, cols, 3, uint8], face: dlib.full_object_detection, size: int = 150, padding: float = 0.25) numpy.ndarray[rows, cols, 3, uint8] ¶
Takes an image and a full_object_detection that references a face in that image and returns the face as a Numpy array representing the image. The face will be rotated upright and scaled to 150x150 pixels by default, or to the given size with the given padding.
- dlib.get_face_chip_details(*args, **kwargs)¶
Overloaded function.
get_face_chip_details(det: dlib::full_object_detection, size: int=200, padding: float=0.2) -> dlib.chip_details
- Given a full_object_detection det, returns a chip_details object which can be used to extract an image of given size and padding.
get_face_chip_details(dets: std::vector<dlib::full_object_detection, std::allocator<dlib::full_object_detection> >, size: int=200, padding: float=0.2) -> dlib.chip_detailss
- Given a list of full_object_detection dets, returns a list of chip_details objects, one per detection, which can be used to extract images of given size and padding.
- dlib.get_face_chips(img: numpy.ndarray[rows, cols, 3, uint8], faces: dlib.full_object_detections, size: int = 150, padding: float = 0.25) list ¶
Takes an image and a full_object_detections object that references faces in that image and returns the faces as a list of Numpy arrays, one image per face. The faces will be rotated upright and scaled to 150x150 pixels by default, or to the given size with the given padding.
- dlib.get_frontal_face_detector() dlib::object_detector<dlib::scan_fhog_pyramid<dlib::pyramid_down<6u>, dlib::default_fhog_feature_extractor> > ¶
Returns the default face detector
- dlib.get_histogram(*args, **kwargs)¶
Overloaded function.
get_histogram(img: numpy.ndarray[(rows,cols),uint8], hist_size: int) -> numpy.ndarray[uint64]
get_histogram(img: numpy.ndarray[(rows,cols),uint16], hist_size: int) -> numpy.ndarray[uint64]
get_histogram(img: numpy.ndarray[(rows,cols),uint32], hist_size: int) -> numpy.ndarray[uint64]
get_histogram(img: numpy.ndarray[(rows,cols),uint64], hist_size: int) -> numpy.ndarray[uint64]
- ensures
Returns a numpy array, HIST, that contains a histogram of the pixels in img. In particular, we will have:
len(HIST) == hist_size
- for all valid i:
HIST[i] == the number of times a pixel with intensity i appears in img.
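The counting rule above amounts to the following plain-Python sketch. The name get_histogram_sketch is hypothetical, the image is a list of rows of non-negative integers, and skipping intensities at or above hist_size is an assumption of this sketch rather than something the text states.

```python
def get_histogram_sketch(img, hist_size):
    """Count how often each intensity 0..hist_size-1 appears in img."""
    hist = [0] * hist_size
    for row in img:
        for value in row:
            # Assumption: intensities that do not fit in the histogram are skipped.
            if value < hist_size:
                hist[value] += 1
    return hist

img = [[0, 1, 1],
       [3, 3, 3]]
hist = get_histogram_sketch(img, 4)   # one 0, two 1s, no 2s, three 3s
```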
- dlib.get_rect(*args, **kwargs)¶
Overloaded function.
get_rect(img: array) -> dlib.rectangle
returns a rectangle(0,0,img.shape[1]-1,img.shape[0]-1). Therefore, it is the rectangle that bounds the image.
get_rect(ht: dlib.hough_transform) -> dlib.rectangle
returns a rectangle(0,0,ht.size()-1,ht.size()-1). Therefore, it is the rectangle that bounds the Hough transform image.
- class dlib.global_function_search¶
See: http://dlib.net/dlib/global_optimization/global_function_search_abstract.h.html
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.global_function_search, function: dlib.function_spec) -> None
__init__(self: dlib.global_function_search, functions: list) -> None
__init__(self: dlib.global_function_search, functions: list, initial_function_evals: list, relative_noise_magnitude: float) -> None
- get_best_function_eval(self: dlib.global_function_search) tuple ¶
- get_function_evaluations(self: dlib.global_function_search) tuple ¶
- get_monte_carlo_upper_bound_sample_num(self: dlib.global_function_search) int ¶
- get_next_x(self: dlib.global_function_search) dlib.function_evaluation_request ¶
- get_pure_random_search_probability(self: dlib.global_function_search) float ¶
- get_relative_noise_magnitude(self: dlib.global_function_search) float ¶
- get_solver_epsilon(self: dlib.global_function_search) float ¶
- num_functions(self: dlib.global_function_search) int ¶
- set_monte_carlo_upper_bound_sample_num(self: dlib.global_function_search, num: int) None ¶
- set_pure_random_search_probability(self: dlib.global_function_search, prob: float) None ¶
- set_relative_noise_magnitude(self: dlib.global_function_search, value: float) None ¶
- set_seed(self: dlib.global_function_search, seed: int) None ¶
- set_solver_epsilon(self: dlib.global_function_search, eps: float) None ¶
- dlib.grow_rect(rect: dlib.rectangle, num: int) dlib.rectangle ¶
return shrink_rect(rect, -num) (i.e. grows the given rectangle by expanding its border by num)
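The relationship between get_rect, shrink_rect, and grow_rect can be illustrated with plain tuples. In this sketch a rectangle is a hypothetical (left, top, right, bottom) tuple with inclusive coordinates, and the *_sketch helpers are stand-ins for the dlib functions, not calls into the library.

```python
def get_rect_sketch(rows, cols):
    """Bounding rectangle of an image with the given number of rows and columns."""
    return (0, 0, cols - 1, rows - 1)

def shrink_rect_sketch(rect, num):
    """Move every border of rect inward by num pixels."""
    left, top, right, bottom = rect
    return (left + num, top + num, right - num, bottom - num)

def grow_rect_sketch(rect, num):
    """Growing is shrinking by a negative amount."""
    return shrink_rect_sketch(rect, -num)

bounds = get_rect_sketch(rows=480, cols=640)   # (0, 0, 639, 479)
grown = grow_rect_sketch(bounds, 10)           # (-10, -10, 649, 489)
```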
- dlib.hit_enter_to_continue() None ¶
Asks the user to hit enter to continue and pauses until they do so.
- class dlib.hough_transform¶
This object is a tool for computing the line finding version of the Hough transform given some kind of edge detection image as input. It also allows the edge pixels to be weighted such that higher weighted edge pixels contribute correspondingly more to the output of the Hough transform, allowing stronger edges to create correspondingly stronger line detections in the final Hough transform.
- __call__(*args, **kwargs)¶
Overloaded function.
__call__(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),uint8], box: dlib.rectangle) -> numpy.ndarray[(rows,cols),float32]
__call__(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),uint16], box: dlib.rectangle) -> numpy.ndarray[(rows,cols),float32]
__call__(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),uint32], box: dlib.rectangle) -> numpy.ndarray[(rows,cols),float32]
__call__(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),uint64], box: dlib.rectangle) -> numpy.ndarray[(rows,cols),float32]
__call__(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),int8], box: dlib.rectangle) -> numpy.ndarray[(rows,cols),float32]
__call__(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),int16], box: dlib.rectangle) -> numpy.ndarray[(rows,cols),float32]
__call__(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),int32], box: dlib.rectangle) -> numpy.ndarray[(rows,cols),float32]
__call__(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),int64], box: dlib.rectangle) -> numpy.ndarray[(rows,cols),float32]
__call__(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),float32], box: dlib.rectangle) -> numpy.ndarray[(rows,cols),float32]
__call__(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),float64], box: dlib.rectangle) -> numpy.ndarray[(rows,cols),float32]
- requires
box.width() == size
box.height() == size
- ensures
Computes the Hough transform of the part of img contained within box. In particular, we do a grayscale version of the Hough transform where any non-zero pixel in img is treated as a potential component of a line and accumulated into the returned Hough accumulator image. However, rather than adding 1 to each relevant accumulator bin we add the value of the pixel in img to each Hough accumulator bin. This means that, if all the pixels in img are 0 or 1 then this routine performs a normal Hough transform. However, if some pixels have larger values then they will be weighted correspondingly more in the resulting Hough transform.
The returned Hough transform image will have size rows and size columns.
The returned image is the Hough transform of the part of img contained in box. Each point in the Hough image corresponds to a line in the input box. In particular, the line for hough_image[y][x] is given by get_line(point(x,y)). Also, when viewing the Hough image, the x-axis gives the angle of the line and the y-axis the distance of the line from the center of the box. The conversion between Hough coordinates and angle and pixel distance can be obtained by calling get_line_properties().
__call__(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),uint8]) -> numpy.ndarray[(rows,cols),float32]
__call__(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),uint16]) -> numpy.ndarray[(rows,cols),float32]
__call__(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),uint32]) -> numpy.ndarray[(rows,cols),float32]
__call__(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),uint64]) -> numpy.ndarray[(rows,cols),float32]
__call__(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),int8]) -> numpy.ndarray[(rows,cols),float32]
__call__(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),int16]) -> numpy.ndarray[(rows,cols),float32]
__call__(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),int32]) -> numpy.ndarray[(rows,cols),float32]
__call__(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),int64]) -> numpy.ndarray[(rows,cols),float32]
__call__(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),float32]) -> numpy.ndarray[(rows,cols),float32]
__call__(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),float64]) -> numpy.ndarray[(rows,cols),float32]
simply performs: return self(img, get_rect(img)). That is, just runs the Hough transform on the whole input image.
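The weighted accumulation described above can be sketched as follows. This is an illustration only: hough_sketch is a hypothetical name, the angle and radius binning is chosen here for simplicity and is not dlib's exact layout, and theta is the direction of the line's normal. What the sketch does preserve is the key idea that each non-zero pixel adds its own value, rather than 1, to every accumulator bin it votes for.

```python
import math

def hough_sketch(img, size):
    """img is a size x size list of rows; returns acc[radius_bin][angle_bin]."""
    acc = [[0.0] * size for _ in range(size)]
    centre = (size - 1) / 2.0
    max_r = math.sqrt(size * size / 2.0)   # largest possible |radius|
    for y in range(size):
        for x in range(size):
            weight = img[y][x]
            if weight == 0:
                continue
            for a in range(size):
                # Assumed binning: angles sweep [-90, 90) degrees across the x-axis.
                theta = -math.pi / 2 + math.pi * a / size
                r = (x - centre) * math.cos(theta) + (y - centre) * math.sin(theta)
                rb = int(round((r / max_r + 1.0) / 2.0 * (size - 1)))
                acc[rb][a] += weight   # add the pixel value, not 1
    return acc

size = 9
img = [[0] * size for _ in range(size)]
img[4] = [2] * size                    # a horizontal line of weight-2 pixels
acc = hough_sketch(img, size)
strongest = max(max(row) for row in acc)
```

With nine pixels of weight 2 on one line, the bin for that line collects 18, twice what a binary image with the same line would produce.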
- __init__(self: dlib.hough_transform, size_: int) None ¶
- find_pixels_voting_for_lines(*args, **kwargs)¶
Overloaded function.
find_pixels_voting_for_lines(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),uint8], box: dlib.rectangle, hough_points: dlib.points, angle_window_size: int=1, radius_window_size: int=1) -> list
find_pixels_voting_for_lines(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),uint16], box: dlib.rectangle, hough_points: dlib.points, angle_window_size: int=1, radius_window_size: int=1) -> list
find_pixels_voting_for_lines(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),uint32], box: dlib.rectangle, hough_points: dlib.points, angle_window_size: int=1, radius_window_size: int=1) -> list
find_pixels_voting_for_lines(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),uint64], box: dlib.rectangle, hough_points: dlib.points, angle_window_size: int=1, radius_window_size: int=1) -> list
find_pixels_voting_for_lines(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),int8], box: dlib.rectangle, hough_points: dlib.points, angle_window_size: int=1, radius_window_size: int=1) -> list
find_pixels_voting_for_lines(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),int16], box: dlib.rectangle, hough_points: dlib.points, angle_window_size: int=1, radius_window_size: int=1) -> list
find_pixels_voting_for_lines(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),int32], box: dlib.rectangle, hough_points: dlib.points, angle_window_size: int=1, radius_window_size: int=1) -> list
find_pixels_voting_for_lines(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),int64], box: dlib.rectangle, hough_points: dlib.points, angle_window_size: int=1, radius_window_size: int=1) -> list
find_pixels_voting_for_lines(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),float32], box: dlib.rectangle, hough_points: dlib.points, angle_window_size: int=1, radius_window_size: int=1) -> list
find_pixels_voting_for_lines(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),float64], box: dlib.rectangle, hough_points: dlib.points, angle_window_size: int=1, radius_window_size: int=1) -> list
- requires
box.width() == size
box.height() == size
- for all valid i:
rectangle(0,0,size-1,size-1).contains(hough_points[i]) == true (i.e. hough_points must contain points in the output Hough transform space generated by this object.)
angle_window_size >= 1
radius_window_size >= 1
- ensures
This function computes the Hough transform of the part of img contained within box. It does the same computation as __call__() defined above, except instead of accumulating into an image we create an explicit list of all the points in img that contributed to each line (i.e. each point in the Hough image). To do this we take a list of Hough points as input and only record hits on these specifically identified Hough points. A typical use of find_pixels_voting_for_lines() is to first run the normal Hough transform using __call__(), then find the lines you are interested in, and then call find_pixels_voting_for_lines() to determine which pixels in the input image belong to those lines.
This routine returns a vector, CONSTITUENT_POINTS, with the following properties:
CONSTITUENT_POINTS.size == hough_points.size
- for all valid i:
Let HP[i] = centered_rect(hough_points[i], angle_window_size, radius_window_size)
Any point in img with a non-zero value that lies on a line corresponding to one of the Hough points in HP[i] is added to CONSTITUENT_POINTS[i]. Therefore, when this routine finishes, #CONSTITUENT_POINTS[i] will contain all the points in img that voted for the lines associated with the Hough accumulator bins in HP[i].
#CONSTITUENT_POINTS[i].size == the number of points in img that voted for any of the lines HP[i] in Hough space. Note, however, that if angle_window_size or radius_window_size are made so large that HP[i] overlaps HP[j] for i!=j, then HP[i] and HP[j] are still treated as disjoint: each overlapping region of Hough space is assigned to either HP[i] or HP[j] in an arbitrary manner.
find_pixels_voting_for_lines(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),uint8], hough_points: dlib.points, angle_window_size: int=1, radius_window_size: int=1) -> list
find_pixels_voting_for_lines(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),uint16], hough_points: dlib.points, angle_window_size: int=1, radius_window_size: int=1) -> list
find_pixels_voting_for_lines(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),uint32], hough_points: dlib.points, angle_window_size: int=1, radius_window_size: int=1) -> list
find_pixels_voting_for_lines(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),uint64], hough_points: dlib.points, angle_window_size: int=1, radius_window_size: int=1) -> list
find_pixels_voting_for_lines(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),int8], hough_points: dlib.points, angle_window_size: int=1, radius_window_size: int=1) -> list
find_pixels_voting_for_lines(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),int16], hough_points: dlib.points, angle_window_size: int=1, radius_window_size: int=1) -> list
find_pixels_voting_for_lines(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),int32], hough_points: dlib.points, angle_window_size: int=1, radius_window_size: int=1) -> list
find_pixels_voting_for_lines(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),int64], hough_points: dlib.points, angle_window_size: int=1, radius_window_size: int=1) -> list
find_pixels_voting_for_lines(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),float32], hough_points: dlib.points, angle_window_size: int=1, radius_window_size: int=1) -> list
find_pixels_voting_for_lines(self: dlib.hough_transform, img: numpy.ndarray[(rows,cols),float64], hough_points: dlib.points, angle_window_size: int=1, radius_window_size: int=1) -> list
performs: return find_pixels_voting_for_lines(img, get_rect(img), hough_points, angle_window_size, radius_window_size);
That is, just runs the routine on the whole input image.
- find_strong_hough_points(self: dlib.hough_transform, himg: numpy.ndarray[rows, cols, float32], hough_count_thresh: float, angle_nms_thresh: float, radius_nms_thresh: float) dlib.points ¶
- requires
himg has size rows and size columns.
angle_nms_thresh >= 0
radius_nms_thresh >= 0
- ensures
This routine finds strong lines in a Hough transform and performs non-maximum suppression on the detected lines. Recall that each point in Hough space is associated with a line. Therefore, this routine finds all the pixels in himg (a Hough transform image) with values >= hough_count_thresh and performs non-maximum suppression on the identified list of pixels. It does this by discarding lines that are within angle_nms_thresh degrees of a stronger line or within radius_nms_thresh distance (in terms of radius as defined by get_line_properties()) to a stronger Hough point.
The identified lines are returned as a list of coordinates in himg.
The returned points are sorted so that points with larger Hough transform values come first.
- get_best_hough_point(self: dlib.hough_transform, p: dlib.point, himg: numpy.ndarray[rows, cols, float32]) dlib.point ¶
- requires
himg has size rows and size columns.
rectangle(0,0,size-1,size-1).contains(p) == true
- ensures
This function interprets himg as a Hough image and p as a point in the original image space. Given this, it finds the maximum scoring line that passes through p. That is, it checks all the Hough accumulator bins in himg corresponding to lines through p and returns the location with the largest score.
returns a point X such that get_rect(himg).contains(X) == true
- get_line(*args, **kwargs)¶
Overloaded function.
get_line(self: dlib.hough_transform, p: dlib.point) -> dlib.line
get_line(self: dlib.hough_transform, p: dlib.dpoint) -> dlib.line
- requires
rectangle(0,0,size-1,size-1).contains(p) == true (i.e. p must be a point inside the Hough accumulator array)
- ensures
returns the line segment in the original image space corresponding to Hough transform point p.
The returned points are inside rectangle(0,0,size-1,size-1).
- get_line_angle_in_degrees(*args, **kwargs)¶
Overloaded function.
get_line_angle_in_degrees(self: dlib.hough_transform, p: dlib.point) -> float
get_line_angle_in_degrees(self: dlib.hough_transform, p: dlib.dpoint) -> float
- requires
rectangle(0,0,size-1,size-1).contains(p) == true (i.e. p must be a point inside the Hough accumulator array)
- ensures
returns the angle, in degrees, of the line corresponding to the Hough transform point p.
- get_line_properties(*args, **kwargs)¶
Overloaded function.
get_line_properties(self: dlib.hough_transform, p: dlib.point) -> tuple
get_line_properties(self: dlib.hough_transform, p: dlib.dpoint) -> tuple
- requires
rectangle(0,0,size-1,size-1).contains(p) == true (i.e. p must be a point inside the Hough accumulator array)
- ensures
Converts a point in the Hough transform space into an angle, in degrees, and a radius, measured in pixels from the center of the input image.
let ANGLE_IN_DEGREES == the angle of the line corresponding to the Hough transform point p. Moreover: -90 <= ANGLE_IN_DEGREES < 90.
RADIUS == the distance, measured in pixels, between the center of the input image and the line corresponding to the Hough transform point p. Moreover: -sqrt(size*size/2) <= RADIUS <= sqrt(size*size/2)
returns a tuple of (ANGLE_IN_DEGREES, RADIUS)
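The radius bound above is simply half the diagonal of a size by size image, which is the farthest any line through the image can be from its center. A short check, with size = 200 chosen arbitrarily for illustration:

```python
import math

size = 200
radius_bound = math.sqrt(size * size / 2)    # bound quoted in the text
half_diagonal = math.hypot(size, size) / 2   # half the image diagonal

# The two expressions agree up to floating point rounding.
same_bound = math.isclose(radius_bound, half_diagonal)
```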
- property size¶
returns the size of the Hough transforms generated by this object. In particular, this object creates Hough transform images that are size by size pixels in size.
- dlib.hysteresis_threshold(*args, **kwargs)¶
Overloaded function.
hysteresis_threshold(img: numpy.ndarray[(rows,cols),uint8], lower_thresh: int, upper_thresh: int) -> numpy.ndarray[(rows,cols),uint8]
hysteresis_threshold(img: numpy.ndarray[(rows,cols),uint16], lower_thresh: int, upper_thresh: int) -> numpy.ndarray[(rows,cols),uint8]
hysteresis_threshold(img: numpy.ndarray[(rows,cols),uint32], lower_thresh: int, upper_thresh: int) -> numpy.ndarray[(rows,cols),uint8]
hysteresis_threshold(img: numpy.ndarray[(rows,cols),uint64], lower_thresh: int, upper_thresh: int) -> numpy.ndarray[(rows,cols),uint8]
hysteresis_threshold(img: numpy.ndarray[(rows,cols),int8], lower_thresh: int, upper_thresh: int) -> numpy.ndarray[(rows,cols),uint8]
hysteresis_threshold(img: numpy.ndarray[(rows,cols),int16], lower_thresh: int, upper_thresh: int) -> numpy.ndarray[(rows,cols),uint8]
hysteresis_threshold(img: numpy.ndarray[(rows,cols),int32], lower_thresh: int, upper_thresh: int) -> numpy.ndarray[(rows,cols),uint8]
hysteresis_threshold(img: numpy.ndarray[(rows,cols),int64], lower_thresh: int, upper_thresh: int) -> numpy.ndarray[(rows,cols),uint8]
hysteresis_threshold(img: numpy.ndarray[(rows,cols),float32], lower_thresh: float, upper_thresh: float) -> numpy.ndarray[(rows,cols),uint8]
hysteresis_threshold(img: numpy.ndarray[(rows,cols),float64], lower_thresh: float, upper_thresh: float) -> numpy.ndarray[(rows,cols),uint8]
Applies hysteresis thresholding to img and returns the results. In particular, pixels in img with values >= upper_thresh have an output value of 255 and all others have a value of 0 unless they are >= lower_thresh and are connected to a pixel with a value >= upper_thresh, in which case they have a value of 255. Here pixels are connected if there is a path between them composed of pixels that would receive an output of 255.
hysteresis_threshold(img: numpy.ndarray[(rows,cols),uint8]) -> numpy.ndarray[(rows,cols),uint8]
hysteresis_threshold(img: numpy.ndarray[(rows,cols),uint16]) -> numpy.ndarray[(rows,cols),uint8]
hysteresis_threshold(img: numpy.ndarray[(rows,cols),uint32]) -> numpy.ndarray[(rows,cols),uint8]
hysteresis_threshold(img: numpy.ndarray[(rows,cols),uint64]) -> numpy.ndarray[(rows,cols),uint8]
hysteresis_threshold(img: numpy.ndarray[(rows,cols),int8]) -> numpy.ndarray[(rows,cols),uint8]
hysteresis_threshold(img: numpy.ndarray[(rows,cols),int16]) -> numpy.ndarray[(rows,cols),uint8]
hysteresis_threshold(img: numpy.ndarray[(rows,cols),int32]) -> numpy.ndarray[(rows,cols),uint8]
hysteresis_threshold(img: numpy.ndarray[(rows,cols),int64]) -> numpy.ndarray[(rows,cols),uint8]
hysteresis_threshold(img: numpy.ndarray[(rows,cols),float32]) -> numpy.ndarray[(rows,cols),uint8]
hysteresis_threshold(img: numpy.ndarray[(rows,cols),float64]) -> numpy.ndarray[(rows,cols),uint8]
performs: return hysteresis_threshold(img, t1, t2) where the thresholds are first obtained by calling [t1, t2]=partition_pixels(img).
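The thresholding rule described above can be sketched as a seeded flood fill. This is a plain-Python illustration, not dlib's code: hysteresis_threshold_sketch is a hypothetical name, the image is a list of rows, and 8-connectivity between pixels is an assumption of the sketch.

```python
from collections import deque

def hysteresis_threshold_sketch(img, lower_thresh, upper_thresh):
    """Return an image of 0/255 values following the hysteresis rule."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    queue = deque()
    # Seed with every pixel at or above the upper threshold.
    for y in range(rows):
        for x in range(cols):
            if img[y][x] >= upper_thresh:
                out[y][x] = 255
                queue.append((y, x))
    # Grow the seeds through neighbours that reach the lower threshold.
    while queue:
        y, x = queue.popleft()
        for ny in range(max(0, y - 1), min(rows, y + 2)):
            for nx in range(max(0, x - 1), min(cols, x + 2)):
                if out[ny][nx] == 0 and img[ny][nx] >= lower_thresh:
                    out[ny][nx] = 255
                    queue.append((ny, nx))
    return out

img = [
    [9, 5, 5, 0, 5],
    [0, 0, 0, 0, 0],
]
result = hysteresis_threshold_sketch(img, 4, 8)
```

The last pixel in the first row passes the lower threshold but stays 0 because no path of accepted pixels links it to the seed.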
- class dlib.image_gradients¶
This class is a tool for computing first and second derivatives of an image. It does this by fitting a quadratic surface around each pixel and then computing the gradients of that quadratic surface. For the details see the paper:
Quadratic models for curved line detection in SAR CCD by Davis E. King and Rhonda D. Phillips
This technique gives very accurate gradient estimates and is also very fast since the entire gradient estimation procedure, for each type of gradient, is accomplished by cross-correlating the image with a single separable filter. This means you can compute gradients at very large scales (e.g. by fitting the quadratic to a large window, like a 99x99 window) and it still runs very quickly.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.image_gradients, scale: int) -> None
Creates this class with the provided scale. i.e. get_scale()==scale. scale must be >= 1.
__init__(self: dlib.image_gradients) -> None
Creates this class with a scale of 1. i.e. get_scale()==1
- get_scale(self: dlib.image_gradients) int ¶
When we estimate a gradient we do so by fitting a quadratic filter to a window of size get_scale()*2+1 centered on each pixel. Therefore, the scale parameter controls the size of gradients we will find. For example, a very large scale will cause the gradient_xx() to be insensitive to high frequency noise in the image while smaller scales would be more sensitive to such fluctuations in the image.
- get_x_filter(self: dlib.image_gradients) numpy.ndarray[rows, cols, float32] ¶
Returns the filter used by the indicated derivative to compute the image gradient. That is, the output gradients are found by cross correlating the returned filter with the input image.
The returned filter has get_scale()*2+1 rows and columns.
- get_xx_filter(self: dlib.image_gradients) numpy.ndarray[rows, cols, float32] ¶
Returns the filter used by the indicated derivative to compute the image gradient. That is, the output gradients are found by cross correlating the returned filter with the input image.
The returned filter has get_scale()*2+1 rows and columns.
- get_xy_filter(self: dlib.image_gradients) numpy.ndarray[rows, cols, float32] ¶
Returns the filter used by the indicated derivative to compute the image gradient. That is, the output gradients are found by cross correlating the returned filter with the input image.
The returned filter has get_scale()*2+1 rows and columns.
- get_y_filter(self: dlib.image_gradients) numpy.ndarray[rows, cols, float32] ¶
Returns the filter used by the indicated derivative to compute the image gradient. That is, the output gradients are found by cross correlating the returned filter with the input image.
The returned filter has get_scale()*2+1 rows and columns.
- get_yy_filter(self: dlib.image_gradients) numpy.ndarray[rows, cols, float32] ¶
Returns the filter used by the indicated derivative to compute the image gradient. That is, the output gradients are found by cross correlating the returned filter with the input image.
The returned filter has get_scale()*2+1 rows and columns.
- gradient_x(*args, **kwargs)¶
Overloaded function.
gradient_x(self: dlib.image_gradients, img: numpy.ndarray[(rows,cols),uint8]) -> tuple
gradient_x(self: dlib.image_gradients, img: numpy.ndarray[(rows,cols),float32]) -> tuple
Let VALID_AREA = shrink_rect(get_rect(img),get_scale()).
This routine computes the requested gradient of img at each location in VALID_AREA. The gradients are returned in a new image of the same dimensions as img. All pixels outside VALID_AREA are set to 0. VALID_AREA is also returned. I.e. we return a tuple where the first element is the gradient image and the second is VALID_AREA.
- gradient_xx(*args, **kwargs)¶
Overloaded function.
gradient_xx(self: dlib.image_gradients, img: numpy.ndarray[(rows,cols),uint8]) -> tuple
gradient_xx(self: dlib.image_gradients, img: numpy.ndarray[(rows,cols),float32]) -> tuple
Let VALID_AREA = shrink_rect(get_rect(img),get_scale()).
This routine computes the requested gradient of img at each location in VALID_AREA. The gradients are returned in a new image of the same dimensions as img. All pixels outside VALID_AREA are set to 0. VALID_AREA is also returned. I.e. we return a tuple where the first element is the gradient image and the second is VALID_AREA.
- gradient_xy(*args, **kwargs)¶
Overloaded function.
gradient_xy(self: dlib.image_gradients, img: numpy.ndarray[(rows,cols),uint8]) -> tuple
gradient_xy(self: dlib.image_gradients, img: numpy.ndarray[(rows,cols),float32]) -> tuple
Let VALID_AREA = shrink_rect(get_rect(img),get_scale()).
This routine computes the requested gradient of img at each location in VALID_AREA. The gradients are returned in a new image of the same dimensions as img. All pixels outside VALID_AREA are set to 0. VALID_AREA is also returned. I.e. we return a tuple where the first element is the gradient image and the second is VALID_AREA.
- gradient_y(*args, **kwargs)¶
Overloaded function.
gradient_y(self: dlib.image_gradients, img: numpy.ndarray[(rows,cols),uint8]) -> tuple
gradient_y(self: dlib.image_gradients, img: numpy.ndarray[(rows,cols),float32]) -> tuple
Let VALID_AREA = shrink_rect(get_rect(img),get_scale()).
This routine computes the requested gradient of img at each location in VALID_AREA. The gradients are returned in a new image of the same dimensions as img. All pixels outside VALID_AREA are set to 0. VALID_AREA is also returned. I.e. we return a tuple where the first element is the gradient image and the second is VALID_AREA.
- gradient_yy(*args, **kwargs)¶
Overloaded function.
gradient_yy(self: dlib.image_gradients, img: numpy.ndarray[(rows,cols),uint8]) -> tuple
gradient_yy(self: dlib.image_gradients, img: numpy.ndarray[(rows,cols),float32]) -> tuple
Let VALID_AREA = shrink_rect(get_rect(img),get_scale()).
This routine computes the requested gradient of img at each location in VALID_AREA. The gradients are returned in a new image of the same dimensions as img. All pixels outside VALID_AREA are set to 0. VALID_AREA is also returned. I.e. we return a tuple where the first element is the gradient image and the second is VALID_AREA.
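The VALID_AREA rule shared by the gradient_*() routines can be worked through in plain Python. The sketch below does not call dlib. The helper names valid_area and window_size are hypothetical, and it assumes that get_rect(img) spans columns 0..cols-1 and rows 0..rows-1 inclusive and that shrink_rect moves each edge inward by the given amount.

```python
def valid_area(rows, cols, scale):
    """Return (left, top, right, bottom), inclusive, for an image of the given size.

    Hypothetical helper that mirrors shrink_rect(get_rect(img), scale).
    """
    if scale < 1:
        raise ValueError("scale must be >= 1")
    # Assumed: get_rect(img) covers columns 0..cols-1 and rows 0..rows-1.
    left, top, right, bottom = 0, 0, cols - 1, rows - 1
    # Assumed: shrink_rect moves every edge inward by `scale` pixels.
    return (left + scale, top + scale, right - scale, bottom - scale)


def window_size(scale):
    """Side length of the window the quadratic is fitted to: get_scale()*2+1."""
    return 2 * scale + 1


area = valid_area(rows=100, cols=200, scale=3)
print("valid area:", area, "window:", window_size(3))
```

With a scale of 49 the fitted window is 99x99, matching the large-window example in the class description. If the image is smaller than the window, the computed rectangle has right < left or bottom < top, i.e. it is empty.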
- class dlib.image_window¶
This is a GUI window capable of showing images on the screen.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.image_window) -> None
__init__(self: dlib.image_window, arg0: dlib.fhog_object_detector) -> None
__init__(self: dlib.image_window, arg0: dlib.simple_object_detector) -> None
__init__(self: dlib.image_window, arg0: dlib.fhog_object_detector, arg1: str) -> None
__init__(self: dlib.image_window, arg0: dlib.simple_object_detector, arg1: str) -> None
__init__(self: dlib.image_window, arg0: numpy.ndarray[(rows,cols),uint8]) -> None
__init__(self: dlib.image_window, arg0: numpy.ndarray[(rows,cols),uint16]) -> None
__init__(self: dlib.image_window, arg0: numpy.ndarray[(rows,cols),uint32]) -> None
__init__(self: dlib.image_window, arg0: numpy.ndarray[(rows,cols),uint64]) -> None
__init__(self: dlib.image_window, arg0: numpy.ndarray[(rows,cols),int8]) -> None
__init__(self: dlib.image_window, arg0: numpy.ndarray[(rows,cols),int16]) -> None
__init__(self: dlib.image_window, arg0: numpy.ndarray[(rows,cols),int32]) -> None
__init__(self: dlib.image_window, arg0: numpy.ndarray[(rows,cols),int64]) -> None
__init__(self: dlib.image_window, arg0: numpy.ndarray[(rows,cols),float32]) -> None
__init__(self: dlib.image_window, arg0: numpy.ndarray[(rows,cols),float64]) -> None
__init__(self: dlib.image_window, arg0: numpy.ndarray[(rows,cols,3),uint8]) -> None
Create an image window that displays the given numpy image.
__init__(self: dlib.image_window, arg0: numpy.ndarray[(rows,cols),uint8], arg1: str) -> None
__init__(self: dlib.image_window, arg0: numpy.ndarray[(rows,cols),uint16], arg1: str) -> None
__init__(self: dlib.image_window, arg0: numpy.ndarray[(rows,cols),uint32], arg1: str) -> None
__init__(self: dlib.image_window, arg0: numpy.ndarray[(rows,cols),uint64], arg1: str) -> None
__init__(self: dlib.image_window, arg0: numpy.ndarray[(rows,cols),int8], arg1: str) -> None
__init__(self: dlib.image_window, arg0: numpy.ndarray[(rows,cols),int16], arg1: str) -> None
__init__(self: dlib.image_window, arg0: numpy.ndarray[(rows,cols),int32], arg1: str) -> None
__init__(self: dlib.image_window, arg0: numpy.ndarray[(rows,cols),int64], arg1: str) -> None
__init__(self: dlib.image_window, arg0: numpy.ndarray[(rows,cols),float32], arg1: str) -> None
__init__(self: dlib.image_window, arg0: numpy.ndarray[(rows,cols),float64], arg1: str) -> None
__init__(self: dlib.image_window, arg0: numpy.ndarray[(rows,cols,3),uint8], arg1: str) -> None
Create an image window that displays the given numpy image and also has the given title.
- add_overlay(*args, **kwargs)¶
Overloaded function.
add_overlay(self: dlib.image_window, rectangles: dlib.rectangles, color: dlib.rgb_pixel=rgb_pixel(255,0,0)) -> None
Add a list of rectangles to the image_window. They will be displayed as red boxes by default, but the color can be passed.
add_overlay(self: dlib.image_window, rectangle: dlib.rectangle, color: dlib.rgb_pixel=rgb_pixel(255,0,0)) -> None
Add a rectangle to the image_window. It will be displayed as a red box by default, but the color can be passed.
add_overlay(self: dlib.image_window, rectangle: dlib.drectangle, color: dlib.rgb_pixel=rgb_pixel(255,0,0)) -> None
Add a rectangle to the image_window. It will be displayed as a red box by default, but the color can be passed.
add_overlay(self: dlib.image_window, detection: dlib.full_object_detection, color: dlib.rgb_pixel=rgb_pixel(0,0,255)) -> None
Add full_object_detection parts to the image window. They will be displayed as blue lines by default, but the color can be passed.
add_overlay(self: dlib.image_window, line: dlib.line, color: dlib.rgb_pixel=rgb_pixel(255,0,0)) -> None
Add line to the image window.
add_overlay(self: dlib.image_window, objects: list, color: dlib.rgb_pixel=rgb_pixel(255,0,0)) -> None
Adds all the overlayable objects in the list, using the given color.
- add_overlay_circle(*args, **kwargs)¶
Overloaded function.
add_overlay_circle(self: dlib.image_window, center: dlib.point, radius: float, color: dlib.rgb_pixel=rgb_pixel(255,0,0)) -> None
Add circle to the image window.
add_overlay_circle(self: dlib.image_window, center: dlib.dpoint, radius: float, color: dlib.rgb_pixel=rgb_pixel(255,0,0)) -> None
Add circle to the image window.
- clear_overlay(self: dlib.image_window) None ¶
Remove all overlays from the image_window.
- get_next_double_click(self: dlib.image_window) object ¶
Blocks until the user double clicks on the image or closes the window. Returns a dlib.point indicating the pixel the user clicked on or None if the window was closed.
- get_next_keypress(self: dlib.image_window, get_keyboard_modifiers: bool = False) object ¶
Blocks until the user presses a key on their keyboard or the window is closed.
- ensures
- if (get_keyboard_modifiers==True) then
returns a tuple of (key_pressed, keyboard_modifiers_active)
- else
returns just the key that was pressed.
The returned key is either a str containing the letter that was pressed, or an element of the dlib.non_printable_keyboard_keys enum.
keyboard_modifiers_active, if returned, is a list of elements of the dlib.keyboard_mod_keys enum. They tell you if a key like shift was being held down or not during the button press.
If the window is closed before the user presses a key then this function returns with all outputs set to None.
- is_closed(self: dlib.image_window) bool ¶
returns true if this window has been closed, false otherwise. (Note that closed windows do not receive any callbacks at all. They are also not visible on the screen.)
- set_image(*args, **kwargs)¶
Overloaded function.
set_image(self: dlib.image_window, detector: dlib.simple_object_detector) -> None
Make the image_window display the given HOG detector’s filters.
set_image(self: dlib.image_window, detector: dlib.fhog_object_detector) -> None
Make the image_window display the given HOG detector’s filters.
set_image(self: dlib.image_window, image: numpy.ndarray[(rows,cols),uint8]) -> None
set_image(self: dlib.image_window, image: numpy.ndarray[(rows,cols),uint16]) -> None
set_image(self: dlib.image_window, image: numpy.ndarray[(rows,cols),uint32]) -> None
set_image(self: dlib.image_window, image: numpy.ndarray[(rows,cols),uint64]) -> None
set_image(self: dlib.image_window, image: numpy.ndarray[(rows,cols),int8]) -> None
set_image(self: dlib.image_window, image: numpy.ndarray[(rows,cols),int16]) -> None
set_image(self: dlib.image_window, image: numpy.ndarray[(rows,cols),int32]) -> None
set_image(self: dlib.image_window, image: numpy.ndarray[(rows,cols),int64]) -> None
set_image(self: dlib.image_window, image: numpy.ndarray[(rows,cols),float32]) -> None
set_image(self: dlib.image_window, image: numpy.ndarray[(rows,cols),float64]) -> None
set_image(self: dlib.image_window, image: numpy.ndarray[(rows,cols,3),uint8]) -> None
Make the image_window display the given image.
- set_title(self: dlib.image_window, title: str) None ¶
Set the title of the window to the given value.
- wait_for_keypress(*args, **kwargs)¶
Overloaded function.
wait_for_keypress(self: dlib.image_window, key: str) -> None
Blocks until the user presses the given key or closes the window.
wait_for_keypress(self: dlib.image_window, key: dlib::base_window::non_printable_keyboard_keys) -> None
Blocks until the user presses the given key or closes the window.
- wait_until_closed(self: dlib.image_window) None ¶
This function blocks until the window is closed.
- dlib.intersect(a: dlib.line, b: dlib.line) dlib.dpoint ¶
- ensures
returns the point of intersection between lines a and b. If no such point exists then this function returns a point with Inf values in it.
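As an illustration of the contract above, here is a plain-Python sketch of intersecting two infinite lines, each given by two points. It is not dlib's implementation. The function name intersect_lines and the tuple-based point representation are assumptions made for the example.

```python
import math


def intersect_lines(p1, p2, p3, p4):
    """Intersect the line through p1, p2 with the line through p3, p4.

    Points are (x, y) tuples. Returns (inf, inf) when the lines are
    parallel, mirroring the documented "point with Inf values" behavior.
    """
    x1, y1 = p1
    x2, y2 = p2
    x3, y3 = p3
    x4, y4 = p4
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if denom == 0:
        return (math.inf, math.inf)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    px = (a * (x3 - x4) - (x1 - x2) * b) / denom
    py = (a * (y3 - y4) - (y1 - y2) * b) / denom
    return (px, py)


crossing = intersect_lines((0, 0), (2, 2), (0, 2), (2, 0))
parallel = intersect_lines((0, 0), (1, 0), (0, 1), (1, 1))
print(crossing, parallel)
```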
- dlib.inv(trans: dlib.point_transform_projective) dlib.point_transform_projective ¶
- ensures
If trans is an invertible transformation then this function returns a new transformation that is the inverse of trans.
- dlib.jet(*args, **kwargs)¶
Overloaded function.
jet(img: numpy.ndarray[(rows,cols),uint8]) -> numpy.ndarray[(rows,cols,3),uint8]
jet(img: numpy.ndarray[(rows,cols),uint16]) -> numpy.ndarray[(rows,cols,3),uint8]
jet(img: numpy.ndarray[(rows,cols),uint32]) -> numpy.ndarray[(rows,cols,3),uint8]
jet(img: numpy.ndarray[(rows,cols),float32]) -> numpy.ndarray[(rows,cols,3),uint8]
jet(img: numpy.ndarray[(rows,cols),float64]) -> numpy.ndarray[(rows,cols,3),uint8]
Converts a grayscale image into a jet colored image. This is an image where dark pixels are dark blue and larger values become light blue, then yellow, and then finally red as they approach the maximum pixel values.
- dlib.jitter_image(img: numpy.ndarray[rows, cols, 3, uint8], num_jitters: int = 1, disturb_colors: bool = False) list ¶
Takes an image and returns a list of jittered images. The returned list contains num_jitters images (default is 1). If disturb_colors is set to True, the colors of the image are disturbed (default is False).
- class dlib.keyboard_mod_keys¶
- KBD_MOD_ALT = keyboard_mod_keys.KBD_MOD_ALT¶
- KBD_MOD_CAPS_LOCK = keyboard_mod_keys.KBD_MOD_CAPS_LOCK¶
- KBD_MOD_CONTROL = keyboard_mod_keys.KBD_MOD_CONTROL¶
- KBD_MOD_META = keyboard_mod_keys.KBD_MOD_META¶
- KBD_MOD_NONE = keyboard_mod_keys.KBD_MOD_NONE¶
- KBD_MOD_NUM_LOCK = keyboard_mod_keys.KBD_MOD_NUM_LOCK¶
- KBD_MOD_SCROLL_LOCK = keyboard_mod_keys.KBD_MOD_SCROLL_LOCK¶
- KBD_MOD_SHIFT = keyboard_mod_keys.KBD_MOD_SHIFT¶
- __init__(self: dlib.keyboard_mod_keys, arg0: int) None ¶
- dlib.label_connected_blobs(*args, **kwargs)¶
Overloaded function.
label_connected_blobs(img: numpy.ndarray[(rows,cols),uint8], zero_pixels_are_background: bool=True, neighborhood_connectivity: int=8, connected_if_both_not_zero: bool=False) -> tuple
label_connected_blobs(img: numpy.ndarray[(rows,cols),uint16], zero_pixels_are_background: bool=True, neighborhood_connectivity: int=8, connected_if_both_not_zero: bool=False) -> tuple
label_connected_blobs(img: numpy.ndarray[(rows,cols),uint32], zero_pixels_are_background: bool=True, neighborhood_connectivity: int=8, connected_if_both_not_zero: bool=False) -> tuple
label_connected_blobs(img: numpy.ndarray[(rows,cols),uint64], zero_pixels_are_background: bool=True, neighborhood_connectivity: int=8, connected_if_both_not_zero: bool=False) -> tuple
label_connected_blobs(img: numpy.ndarray[(rows,cols),float32], zero_pixels_are_background: bool=True, neighborhood_connectivity: int=8, connected_if_both_not_zero: bool=False) -> tuple
label_connected_blobs(img: numpy.ndarray[(rows,cols),float64], zero_pixels_are_background: bool=True, neighborhood_connectivity: int=8, connected_if_both_not_zero: bool=False) -> tuple
- requires
neighborhood_connectivity == 4, 8, or 24
- ensures
This function labels each of the connected blobs in img with a unique integer label.
An image can be thought of as a graph where pixels A and B are connected if they are close to each other and satisfy some criterion like having the same value or both being non-zero. Then this function can be understood as labeling all the connected components of this pixel graph such that all pixels in a component get the same label while pixels in different components get different labels.
If zero_pixels_are_background==true then there is a special background component and all pixels with value 0 are assigned to it. Moreover, all such background pixels will always get a blob id of 0 regardless of any other considerations.
This function returns a label image and a count of the number of blobs found. I.e., if you ran this function like:
label_img, num_blobs = label_connected_blobs(img)
You would obtain the noted label image and number of blobs.
The output label_img has the same dimensions as the input image.
- for all valid r and c:
label_img[r][c] == the blob label number for pixel img[r][c].
label_img[r][c] >= 0
- if (img[r][c]==0) then
label_img[r][c] == 0
- else
label_img[r][c] != 0
- if (len(img) != 0) then
The returned num_blobs will be == label_img.max()+1 (i.e. a number one greater than the maximum blob id, which is the number of blobs found).
- else
num_blobs will be 0.
Blob labels are contiguous. Therefore, the number returned by this function is the number of blobs in the image (including the background blob).
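The labeling contract above can be illustrated with a small flood fill in plain Python. This is a sketch under stated assumptions, not dlib's implementation: the function name label_blobs is hypothetical, only 4- and 8-connectivity are handled, zero pixels are always treated as background, and labels are handed out in raster order (the documentation does not specify an order).

```python
from collections import deque


def label_blobs(img, connectivity=8, connected_if_both_not_zero=False):
    """Label connected blobs in a list-of-lists image; returns (label_img, num_blobs)."""
    if connectivity == 4:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    elif connectivity == 8:
        offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0)]
    else:
        raise ValueError("this sketch supports only 4- or 8-connectivity")
    rows = len(img)
    cols = len(img[0]) if rows else 0
    if rows == 0 or cols == 0:
        return [], 0  # empty input: no blobs at all
    labels = [[0] * cols for _ in range(rows)]  # 0 is the background label
    next_label = 1
    for r in range(rows):
        for c in range(cols):
            if img[r][c] == 0 or labels[r][c] != 0:
                continue  # background pixel or already labeled
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one blob
                cr, cc = queue.popleft()
                for dr, dc in offsets:
                    nr, nc = cr + dr, cc + dc
                    if not (0 <= nr < rows and 0 <= nc < cols):
                        continue
                    if img[nr][nc] == 0 or labels[nr][nc] != 0:
                        continue
                    # Both pixels are non-zero here, so they are connected
                    # either unconditionally or only when their values match.
                    if connected_if_both_not_zero or img[nr][nc] == img[cr][cc]:
                        labels[nr][nc] = next_label
                        queue.append((nr, nc))
            next_label += 1
    # Labels are contiguous, so next_label equals max label + 1.
    return labels, next_label


label_img, num_blobs = label_blobs([[0, 1, 1, 0],
                                    [0, 0, 0, 0],
                                    [2, 2, 0, 1]])
print(label_img, num_blobs)
```

The count includes the background blob, so an image with three foreground blobs yields num_blobs == 4.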
- dlib.label_connected_blobs_watershed(*args, **kwargs)¶
Overloaded function.
label_connected_blobs_watershed(img: numpy.ndarray[(rows,cols),uint8], background_thresh: int, smoothing: float=0) -> tuple
label_connected_blobs_watershed(img: numpy.ndarray[(rows,cols),uint16], background_thresh: int, smoothing: float=0) -> tuple
label_connected_blobs_watershed(img: numpy.ndarray[(rows,cols),uint32], background_thresh: int, smoothing: float=0) -> tuple
label_connected_blobs_watershed(img: numpy.ndarray[(rows,cols),float32], background_thresh: float, smoothing: float=0) -> tuple
label_connected_blobs_watershed(img: numpy.ndarray[(rows,cols),float64], background_thresh: float, smoothing: float=0) -> tuple
- requires
smoothing >= 0
- ensures
This routine performs a watershed segmentation of the given input image and labels each resulting flooding region with a unique integer label. It does this by marking the brightest pixels as sources of flooding and then flood fills the image outward from those sources. Each flooded area is labeled with the identity of the source pixel and flooding stops when another flooded area is reached or pixels with values < background_thresh are encountered.
The flooding will also overrun a source pixel if that source pixel has yet to label any neighboring pixels. This behavior helps to mitigate spurious splits of objects due to noise. You can further control this behavior by setting the smoothing parameter. The flooding will take place on an image that has been Gaussian blurred with a sigma==smoothing. So setting smoothing to a larger number will in general cause more regions to be merged together. Note that the smoothing parameter has no effect on the interpretation of background_thresh since the decision of “background or not background” is always made relative to the unsmoothed input image.
This function returns a tuple of the labeled image and number of blobs found. i.e. you can call it like this:
label_img, num_blobs = label_connected_blobs_watershed(img,background_thresh,smoothing)
The returned label_img will have the same dimensions as img.
- for all valid r and c:
- if (img[r][c] < background_thresh) then
label_img[r][c] == 0, (i.e. the pixel is labeled as background)
- else
label_img[r][c] == an integer value indicating the identity of the segment containing the pixel img[r][c].
The returned num_blobs is the number of labeled segments, including the background segment. Therefore, the returned number is 1+(the max value in label_img).
label_connected_blobs_watershed(img: numpy.ndarray[(rows,cols),uint8]) -> tuple
label_connected_blobs_watershed(img: numpy.ndarray[(rows,cols),uint16]) -> tuple
label_connected_blobs_watershed(img: numpy.ndarray[(rows,cols),uint32]) -> tuple
label_connected_blobs_watershed(img: numpy.ndarray[(rows,cols),float32]) -> tuple
label_connected_blobs_watershed(img: numpy.ndarray[(rows,cols),float64]) -> tuple
- This version of label_connected_blobs_watershed simply invokes:
return label_connected_blobs_watershed(img, partition_pixels(img))
- dlib.length(*args, **kwargs)¶
Overloaded function.
length(p: dlib.point) -> float
returns the distance from p to the origin, i.e. the L2 norm of p.
length(p: dlib.dpoint) -> float
returns the distance from p to the origin, i.e. the L2 norm of p.
- class dlib.line¶
This object represents a line in the 2D plane. The line is defined by two points running through it, p1 and p2. This object also includes a unit normal vector that is perpendicular to the line.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.line) -> None
p1, p2, and normal are all the 0 vector.
__init__(self: dlib.line, a: dlib.dpoint, b: dlib.dpoint) -> None
- ensures
#p1 == a
#p2 == b
#normal == A vector normal to the line passing through points a and b. Therefore, the normal vector is the vector (a-b) but unit normalized and rotated clockwise 90 degrees.
__init__(self: dlib.line, a: dlib.point, b: dlib.point) -> None
- ensures
#p1 == a
#p2 == b
#normal == A vector normal to the line passing through points a and b. Therefore, the normal vector is the vector (a-b) but unit normalized and rotated clockwise 90 degrees.
- property normal¶
returns a unit vector that is normal to the line passing through p1 and p2.
- property p1¶
returns the first endpoint of the line.
- property p2¶
returns the second endpoint of the line.
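The normal described in the constructors can be sketched in plain Python. This is an illustration only and does not call dlib. The function name line_normal is hypothetical, and "clockwise" is interpreted in the usual mathematical convention with the y axis pointing up, which is an assumption about the intended coordinate system.

```python
import math


def line_normal(a, b):
    """Unit normal for the line through points a and b, given as (x, y) tuples.

    Takes the vector (a - b), normalizes it, and rotates it 90 degrees
    clockwise (assuming y points up), which maps (dx, dy) to (dy, -dx).
    """
    dx, dy = a[0] - b[0], a[1] - b[1]
    norm = math.hypot(dx, dy)
    if norm == 0:
        # Degenerate line (a == b). Returning the zero vector is a choice
        # made for this sketch, not documented dlib behavior.
        return (0.0, 0.0)
    return (dy / norm, -dx / norm)


a, b = (0.0, 0.0), (4.0, 0.0)
n = line_normal(a, b)
print(n)
```

Whatever the sign convention, the result has unit length and is perpendicular to (a - b), which is what the normal property promises.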
- dlib.load_grayscale_image(filename: str) numpy.ndarray[rows, cols, uint8] ¶
Takes a path and returns a numpy array containing the image, as an 8bit grayscale image.
- dlib.load_libsvm_formatted_data(file_name: str) tuple ¶
- ensures
Attempts to read a file of the given name that should contain libsvm formatted data. The data is returned as a tuple where the first tuple element is an array of sparse vectors and the second element is an array of labels.
- dlib.load_rgb_image(filename: str) numpy.ndarray[rows, cols, 3, uint8] ¶
Takes a path and returns a numpy array (RGB) containing the image
- dlib.make_bounding_box_regression_training_data(truth: dlib.image_dataset_metadata.dataset, detections: object) dlib.image_dataset_metadata.dataset ¶
- requires
len(truth.images) == len(detections)
detections == A dlib.rectangless object or a list of dlib.rectangles.
- ensures
Suppose you have an object detector that can roughly locate objects in an image. This means your detector draws boxes around objects, but these are rough boxes in the sense that they aren’t positioned super accurately. For instance, HOG based detectors usually have a stride of 8 pixels. So the positional accuracy is going to be, at best, +/-8 pixels.
If you want to get better positional accuracy one easy thing to do is train a shape_predictor to give you the corners of the object. The make_bounding_box_regression_training_data() routine helps you do this by creating an appropriate training dataset. It does this by taking the dataset you used to train your detector (the truth object), and combining that with the output of your detector on each image in the training dataset (the detections object). In particular, it will create a new annotated dataset where each object box is one of the rectangles from detections and that object has 4 part annotations, the corners of the truth rectangle corresponding to that detection rectangle. You can then take the returned dataset and train a shape_predictor on it. The resulting shape_predictor can then be used to do bounding box regression.
We assume that detections[i] contains object detections corresponding to the image truth.images[i].
- dlib.make_sparse_vector(*args, **kwargs)¶
Overloaded function.
make_sparse_vector(arg0: dlib.sparse_vector) -> None
This function modifies its argument so that it is a properly sorted sparse vector. This means that the elements of the sparse vector will be ordered so that pairs with smaller indices come first. Additionally, there won’t be any pairs with identical indices. If such pairs were present in the input sparse vector then their values will be added together and only one pair with their index will be present in the output.
make_sparse_vector(arg0: dlib.sparse_vectors) -> None
This function modifies a sparse_vectors object so that all elements it contains are properly sorted sparse vectors.
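The normalization described above (sort by index, merge duplicate indices by summing their values) can be sketched in plain Python on a list of (index, value) pairs. The function name sorted_sparse_vector is hypothetical. Unlike make_sparse_vector, which modifies its argument in place, this sketch returns a new list.

```python
def sorted_sparse_vector(pairs):
    """Return (index, value) pairs sorted by index with duplicate indices summed."""
    totals = {}
    for index, value in pairs:
        # Values that share an index are added together.
        totals[index] = totals.get(index, 0.0) + value
    # Smaller indices come first and each index appears exactly once.
    return sorted(totals.items())


cleaned = sorted_sparse_vector([(3, 1.0), (1, 2.0), (3, 0.5)])
print(cleaned)
```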
- class dlib.matrix¶
This object represents a dense 2D matrix of floating point numbers. Moreover, it binds directly to the C++ type dlib::matrix<double>.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.matrix) -> None
__init__(self: dlib.matrix, arg0: list) -> None
__init__(self: dlib.matrix, arg0: object) -> None
__init__(self: dlib.matrix, arg0: int, arg1: int) -> None
- deserialize(self: dlib.matrix, file: str) None ¶
Deserialize the matrix from a file
- nc(self: dlib.matrix) int ¶
Return the number of columns in the matrix.
- nr(self: dlib.matrix) int ¶
Return the number of rows in the matrix.
- serialize(self: dlib.matrix, file: str) None ¶
Serialize the matrix to a file
- set_size(self: dlib.matrix, rows: int, cols: int) None ¶
Set the size of the matrix to the given number of rows and columns.
- property shape¶
- dlib.max_cost_assignment(cost: dlib.matrix) list ¶
- requires
cost.nr() == cost.nc() (i.e. the input must be a square matrix)
- ensures
Finds and returns the solution to the following optimization problem:
Maximize: f(A) == assignment_cost(cost, A)
Subject to the following constraints:
The elements of A are unique. That is, there aren’t any elements of A which are equal.
len(A) == cost.nr()
Note that this function converts the input cost matrix into a 64bit fixed point representation. Therefore, you should make sure that the values in your cost matrix can be accurately represented by 64bit fixed point values. If this is not the case then the solution may become inaccurate due to rounding error. In general, this function will work properly when the ratio of the largest to the smallest value in cost is no more than about 1e16.
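To make the optimization problem concrete, the following plain-Python sketch solves it by brute force over all permutations. It is only practical for tiny matrices and is not the algorithm dlib uses. The function names are hypothetical, and assignment_cost is assumed to be the sum of cost[i][A[i]] over all rows i.

```python
from itertools import permutations


def assignment_cost(cost, assignment):
    """Assumed definition: sum of cost[i][assignment[i]] over all rows i."""
    return sum(cost[i][j] for i, j in enumerate(assignment))


def brute_force_max_cost_assignment(cost):
    """Try every permutation A of the column indices and keep the best one."""
    n = len(cost)
    if any(len(row) != n for row in cost):
        raise ValueError("cost must be a square matrix")
    # A permutation satisfies both constraints: unique elements and len(A) == n.
    best = max(permutations(range(n)), key=lambda a: assignment_cost(cost, a))
    return list(best)


cost = [[1, 2, 6],
        [5, 3, 6],
        [4, 5, 0]]
best = brute_force_max_cost_assignment(cost)
print(best, assignment_cost(cost, best))
```

Here row 0 is assigned column 2, row 1 column 0, and row 2 column 1, for a total cost of 6 + 5 + 5 = 16.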
- dlib.max_index_plus_one(v: dlib.sparse_vector) int ¶
- ensures
returns the dimensionality of the given sparse vector. That is, returns a number one larger than the maximum index value in the vector. If the vector is empty then returns 0.
- dlib.max_point(*args, **kwargs)¶
Overloaded function.
max_point(img: numpy.ndarray[(rows,cols),uint8]) -> dlib.dpoint
max_point(img: numpy.ndarray[(rows,cols),uint16]) -> dlib.dpoint
max_point(img: numpy.ndarray[(rows,cols),uint32]) -> dlib.dpoint
max_point(img: numpy.ndarray[(rows,cols),uint64]) -> dlib.dpoint
max_point(img: numpy.ndarray[(rows,cols),int8]) -> dlib.dpoint
max_point(img: numpy.ndarray[(rows,cols),int16]) -> dlib.dpoint
max_point(img: numpy.ndarray[(rows,cols),int32]) -> dlib.dpoint
max_point(img: numpy.ndarray[(rows,cols),int64]) -> dlib.dpoint
max_point(img: numpy.ndarray[(rows,cols),float32]) -> dlib.dpoint
max_point(img: numpy.ndarray[(rows,cols),float64]) -> dlib.dpoint
- requires
img.size > 0
- ensures
returns the location of the maximum element of the array, that is, if the returned point is P then it will be the case that: img[P.y,P.x] == img.max().
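The (x, y) versus [row, column] convention in the guarantee above is easy to get backwards, so here is a plain-Python sketch of the same contract on a list-of-lists image. The function name max_point_xy is hypothetical, and the tie-breaking rule (first maximum in raster order) is an assumption, since the documentation does not say which maximum is returned when several pixels share the largest value.

```python
def max_point_xy(img):
    """Return (x, y) such that img[y][x] is the largest value in img."""
    if not img or not img[0]:
        raise ValueError("img must not be empty")
    best_x, best_y = 0, 0
    for y, row in enumerate(img):
        for x, value in enumerate(row):
            # Strict comparison keeps the first maximum in raster order.
            if value > img[best_y][best_x]:
                best_x, best_y = x, y
    return (best_x, best_y)


img = [[1, 5],
       [7, 2]]
p = max_point_xy(img)
print(p, img[p[1]][p[0]])
```

Note that the point's x coordinate indexes columns and its y coordinate indexes rows, so the lookup is img[P.y][P.x].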
- dlib.max_point_interpolated(*args, **kwargs)¶
Overloaded function.
max_point_interpolated(img: numpy.ndarray[(rows,cols),uint8]) -> dlib.dpoint
max_point_interpolated(img: numpy.ndarray[(rows,cols),uint16]) -> dlib.dpoint
max_point_interpolated(img: numpy.ndarray[(rows,cols),uint32]) -> dlib.dpoint
max_point_interpolated(img: numpy.ndarray[(rows,cols),uint64]) -> dlib.dpoint
max_point_interpolated(img: numpy.ndarray[(rows,cols),int8]) -> dlib.dpoint
max_point_interpolated(img: numpy.ndarray[(rows,cols),int16]) -> dlib.dpoint
max_point_interpolated(img: numpy.ndarray[(rows,cols),int32]) -> dlib.dpoint
max_point_interpolated(img: numpy.ndarray[(rows,cols),int64]) -> dlib.dpoint
max_point_interpolated(img: numpy.ndarray[(rows,cols),float32]) -> dlib.dpoint
max_point_interpolated(img: numpy.ndarray[(rows,cols),float64]) -> dlib.dpoint
- requires
img.size > 0
- ensures
Like max_point(), this function finds the location in img with the largest value. However, we additionally use some quadratic interpolation to find the location of the maximum point with sub-pixel accuracy. Therefore, the returned point is equal to max_point(img) + some small sub-pixel delta.
- dlib.min_barrier_distance(*args, **kwargs)¶
Overloaded function.
min_barrier_distance(img: numpy.ndarray[(rows,cols),uint8], iterations: int=10, do_left_right_scans: bool=True) -> numpy.ndarray[(rows,cols),uint8]
min_barrier_distance(img: numpy.ndarray[(rows,cols),uint16], iterations: int=10, do_left_right_scans: bool=True) -> numpy.ndarray[(rows,cols),uint16]
min_barrier_distance(img: numpy.ndarray[(rows,cols),uint32], iterations: int=10, do_left_right_scans: bool=True) -> numpy.ndarray[(rows,cols),uint32]
min_barrier_distance(img: numpy.ndarray[(rows,cols),uint64], iterations: int=10, do_left_right_scans: bool=True) -> numpy.ndarray[(rows,cols),uint64]
min_barrier_distance(img: numpy.ndarray[(rows,cols),int8], iterations: int=10, do_left_right_scans: bool=True) -> numpy.ndarray[(rows,cols),int8]
min_barrier_distance(img: numpy.ndarray[(rows,cols),int16], iterations: int=10, do_left_right_scans: bool=True) -> numpy.ndarray[(rows,cols),int16]
min_barrier_distance(img: numpy.ndarray[(rows,cols),int32], iterations: int=10, do_left_right_scans: bool=True) -> numpy.ndarray[(rows,cols),int32]
min_barrier_distance(img: numpy.ndarray[(rows,cols),int64], iterations: int=10, do_left_right_scans: bool=True) -> numpy.ndarray[(rows,cols),int64]
min_barrier_distance(img: numpy.ndarray[(rows,cols),float32], iterations: int=10, do_left_right_scans: bool=True) -> numpy.ndarray[(rows,cols),float32]
min_barrier_distance(img: numpy.ndarray[(rows,cols),float64], iterations: int=10, do_left_right_scans: bool=True) -> numpy.ndarray[(rows,cols),float64]
min_barrier_distance(img: numpy.ndarray[(rows,cols,3),uint8], iterations: int=10, do_left_right_scans: bool=True) -> numpy.ndarray[(rows,cols),uint8]
- requires
iterations > 0
- ensures
- This function implements the salient object detection method described in the paper:
“Minimum barrier salient object detection at 80 fps” by Zhang, Jianming, et al.
In particular, we compute the minimum barrier distance between the borders of the image and all the other pixels. The resulting image is returned. Note that the paper talks about a bunch of other things you could do beyond computing the minimum barrier distance, but this function doesn’t do any of that. It’s just the vanilla MBD.
We will perform iterations iterations of MBD passes over the image. Larger values might give better results but run slower.
During each MBD iteration we make raster scans over the image. These pass from top->bottom, bottom->top, left->right, and right->left. If do_left_right_scans==false then the left/right passes are not executed. Skipping them makes the algorithm about 2x faster but might reduce the quality of the output.
- class dlib.mmod_rectangle¶
Wrapper around a rectangle object and a detection confidence score.
- __init__(*args, **kwargs)¶
- property confidence¶
- property rect¶
- class dlib.mmod_rectangles¶
An array of mmod rectangle objects.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.mmod_rectangles) -> None
__init__(self: dlib.mmod_rectangles, arg0: dlib.mmod_rectangles) -> None
Copy constructor
__init__(self: dlib.mmod_rectangles, arg0: iterable) -> None
- append(self: dlib.mmod_rectangles, x: dlib.mmod_rectangle) None ¶
Add an item to the end of the list
- count(self: dlib.mmod_rectangles, x: dlib.mmod_rectangle) int ¶
Return the number of times x appears in the list
- extend(*args, **kwargs)¶
Overloaded function.
extend(self: dlib.mmod_rectangles, L: dlib.mmod_rectangles) -> None
Extend the list by appending all the items in the given list
extend(self: dlib.mmod_rectangles, arg0: list) -> None
- insert(self: dlib.mmod_rectangles, i: int, x: dlib.mmod_rectangle) None ¶
Insert an item at a given position.
- pop(*args, **kwargs)¶
Overloaded function.
pop(self: dlib.mmod_rectangles) -> dlib.mmod_rectangle
Remove and return the last item
pop(self: dlib.mmod_rectangles, i: int) -> dlib.mmod_rectangle
Remove and return the item at index i
- remove(self: dlib.mmod_rectangles, x: dlib.mmod_rectangle) None ¶
Remove the first item from the list whose value is x. It is an error if there is no such item.
- class dlib.mmod_rectangless¶
A 2D array of mmod rectangle objects.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.mmod_rectangless) -> None
__init__(self: dlib.mmod_rectangless, arg0: dlib.mmod_rectangless) -> None
Copy constructor
__init__(self: dlib.mmod_rectangless, arg0: iterable) -> None
- append(self: dlib.mmod_rectangless, x: dlib.mmod_rectangles) None ¶
Add an item to the end of the list
- count(self: dlib.mmod_rectangless, x: dlib.mmod_rectangles) int ¶
Return the number of times x appears in the list
- extend(*args, **kwargs)¶
Overloaded function.
extend(self: dlib.mmod_rectangless, L: dlib.mmod_rectangless) -> None
Extend the list by appending all the items in the given list
extend(self: dlib.mmod_rectangless, arg0: list) -> None
- insert(self: dlib.mmod_rectangless, i: int, x: dlib.mmod_rectangles) None ¶
Insert an item at a given position.
- pop(*args, **kwargs)¶
Overloaded function.
pop(self: dlib.mmod_rectangless) -> dlib.mmod_rectangles
Remove and return the last item
pop(self: dlib.mmod_rectangless, i: int) -> dlib.mmod_rectangles
Remove and return the item at index i
- remove(self: dlib.mmod_rectangless, x: dlib.mmod_rectangles) None ¶
Remove the first item from the list whose value is x. It is an error if there is no such item.
- class dlib.momentum_filter¶
This object is a simple tool for filtering a single scalar value that measures the location of a moving object that has some non-trivial momentum. Importantly, the measurements are noisy and the object can experience sudden unpredictable accelerations. To accomplish this filtering we use a simple Kalman filter with a state transition model of:
position_{i+1} = position_{i} + velocity_{i}
velocity_{i+1} = velocity_{i} + some_unpredictable_acceleration
and a measurement model of:
measured_position_{i} = position_{i} + measurement_noise
Where some_unpredictable_acceleration and measurement_noise are 0 mean Gaussian noise sources with standard deviations of get_typical_acceleration() and get_measurement_noise() respectively.
To allow for really sudden and large but infrequent accelerations, at each step we check if the current measured position deviates from the predicted filtered position by more than get_max_measurement_deviation()*get_measurement_noise() and if so we adjust the filter’s state to keep it within these bounds. This allows the moving object to undergo large unmodeled accelerations, far in excess of what would be suggested by get_typical_acceleration(), without then experiencing a long lag time where the Kalman filter has to “catch up” to the new position.
- __call__(self: dlib.momentum_filter, arg0: float) float ¶
- __init__(self: dlib.momentum_filter, measurement_noise: float, typical_acceleration: float, max_measurement_deviation: float) None ¶
- max_measurement_deviation(self: dlib.momentum_filter) float ¶
- measurement_noise(self: dlib.momentum_filter) float ¶
- typical_acceleration(self: dlib.momentum_filter) float ¶
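The filtering scheme described for this class can be illustrated with a small pure-Python sketch. This is not dlib's implementation: the class name MomentumFilterSketch, the initial covariance values, and the exact way the prediction is clamped are assumptions made for illustration only. It follows the state transition and measurement models given above, with a clamp of the predicted position to within max_measurement_deviation*measurement_noise of each measurement.

```python
class MomentumFilterSketch:
    """Illustrative scalar Kalman filter; NOT dlib's actual implementation."""

    def __init__(self, measurement_noise, typical_acceleration,
                 max_measurement_deviation):
        self.r = measurement_noise ** 2        # measurement noise variance
        self.q = typical_acceleration ** 2     # acceleration noise variance
        self.max_dev = max_measurement_deviation * measurement_noise
        self.pos = None                        # no measurement seen yet
        self.vel = 0.0
        # Covariance of (position, velocity); these initial values are assumed.
        self.p00, self.p01, self.p11 = self.r, 0.0, self.q

    def __call__(self, z):
        if self.pos is None:                   # first measurement sets the state
            self.pos = float(z)
            return self.pos
        # Predict: position += velocity, velocity unchanged.
        pos = self.pos + self.vel
        vel = self.vel
        p00 = self.p00 + 2.0 * self.p01 + self.p11
        p01 = self.p01 + self.p11
        p11 = self.p11 + self.q
        # Keep the prediction within max_dev of the measurement so a sudden
        # jump does not leave the filter lagging far behind.
        pos = min(max(pos, z - self.max_dev), z + self.max_dev)
        # Standard Kalman update for a position-only measurement.
        s = p00 + self.r
        k0, k1 = p00 / s, p01 / s
        resid = z - pos
        self.pos = pos + k0 * resid
        self.vel = vel + k1 * resid
        self.p00 = (1.0 - k0) * p00
        self.p01 = (1.0 - k0) * p01
        self.p11 = p11 - k1 * p01
        return self.pos
```

Calling the sketch repeatedly with a constant measurement returns that constant, and after a sudden jump the output stays within the clamp distance of the new measurement.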
- exception dlib.no_convex_quadrilateral¶
- class dlib.non_printable_keyboard_keys¶
- KEY_ALT = non_printable_keyboard_keys.KEY_ALT¶
- KEY_BACKSPACE = non_printable_keyboard_keys.KEY_BACKSPACE¶
- KEY_CAPS_LOCK = non_printable_keyboard_keys.KEY_CAPS_LOCK¶
- KEY_CTRL = non_printable_keyboard_keys.KEY_CTRL¶
- KEY_DELETE = non_printable_keyboard_keys.KEY_DELETE¶
- KEY_DOWN = non_printable_keyboard_keys.KEY_DOWN¶
- KEY_END = non_printable_keyboard_keys.KEY_END¶
- KEY_ESC = non_printable_keyboard_keys.KEY_ESC¶
- KEY_F1 = non_printable_keyboard_keys.KEY_F1¶
- KEY_F10 = non_printable_keyboard_keys.KEY_F10¶
- KEY_F11 = non_printable_keyboard_keys.KEY_F11¶
- KEY_F12 = non_printable_keyboard_keys.KEY_F12¶
- KEY_F2 = non_printable_keyboard_keys.KEY_F2¶
- KEY_F3 = non_printable_keyboard_keys.KEY_F3¶
- KEY_F4 = non_printable_keyboard_keys.KEY_F4¶
- KEY_F5 = non_printable_keyboard_keys.KEY_F5¶
- KEY_F6 = non_printable_keyboard_keys.KEY_F6¶
- KEY_F7 = non_printable_keyboard_keys.KEY_F7¶
- KEY_F8 = non_printable_keyboard_keys.KEY_F8¶
- KEY_F9 = non_printable_keyboard_keys.KEY_F9¶
- KEY_HOME = non_printable_keyboard_keys.KEY_HOME¶
- KEY_INSERT = non_printable_keyboard_keys.KEY_INSERT¶
- KEY_LEFT = non_printable_keyboard_keys.KEY_LEFT¶
- KEY_PAGE_DOWN = non_printable_keyboard_keys.KEY_PAGE_DOWN¶
- KEY_PAGE_UP = non_printable_keyboard_keys.KEY_PAGE_UP¶
- KEY_PAUSE = non_printable_keyboard_keys.KEY_PAUSE¶
- KEY_RIGHT = non_printable_keyboard_keys.KEY_RIGHT¶
- KEY_SCROLL_LOCK = non_printable_keyboard_keys.KEY_SCROLL_LOCK¶
- KEY_SHIFT = non_printable_keyboard_keys.KEY_SHIFT¶
- KEY_UP = non_printable_keyboard_keys.KEY_UP¶
- __init__(self: dlib.non_printable_keyboard_keys, arg0: int) None ¶
- dlib.normalize_image_gradients(*args, **kwargs)¶
Overloaded function.
normalize_image_gradients(img1: numpy.ndarray[(rows,cols),float64], img2: numpy.ndarray[(rows,cols),float64]) -> None
normalize_image_gradients(img1: numpy.ndarray[(rows,cols),float32], img2: numpy.ndarray[(rows,cols),float32]) -> None
- requires
img1 and img2 have the same dimensions.
- ensures
This function assumes img1 and img2 are the two gradient images produced by a function like sobel_edge_detector(). It then unit normalizes the gradient vectors. That is, for all valid r and c, this function ensures that:
img1[r][c]*img1[r][c] + img2[r][c]*img2[r][c] == 1 unless both img1[r][c] and img2[r][c] were 0 initially, then they stay zero.
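As a rough illustration of the guarantee above, the following pure-Python sketch unit normalizes two gradient images stored as nested lists. The function name normalize_gradients and the list-of-lists representation are assumptions for this example; the dlib function itself operates in place on numpy arrays.

```python
import math

def normalize_gradients(img1, img2):
    """Unit-normalize per-pixel gradient vectors in place (lists of lists)."""
    for row1, row2 in zip(img1, img2):
        for c in range(len(row1)):
            norm = math.hypot(row1[c], row2[c])
            if norm != 0.0:            # zero gradients stay zero
                row1[c] /= norm
                row2[c] /= norm

gx = [[3.0, 0.0], [0.0, -2.0]]
gy = [[4.0, 0.0], [5.0, 0.0]]
normalize_gradients(gx, gy)
```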
- dlib.num_separable_filters(detector: dlib.simple_object_detector) int ¶
Returns the number of separable filters necessary to represent the HOG filters in the given detector.
- class dlib.pair¶
This object is used to represent the elements of a sparse_vector.
- property first¶
This field represents the index/dimension number.
- property second¶
This field contains the value in a vector at dimension specified by the first field.
- dlib.partition_pixels(*args, **kwargs)¶
Overloaded function.
partition_pixels(img: numpy.ndarray[(rows,cols,3),uint8]) -> int
partition_pixels(img: numpy.ndarray[(rows,cols),uint8]) -> int
partition_pixels(img: numpy.ndarray[(rows,cols),uint16]) -> int
partition_pixels(img: numpy.ndarray[(rows,cols),uint32]) -> int
partition_pixels(img: numpy.ndarray[(rows,cols),float32]) -> float
partition_pixels(img: numpy.ndarray[(rows,cols),float64]) -> float
Finds a threshold value that would be reasonable to use with threshold_image(img, threshold). It does this by finding the threshold that partitions the pixels in img into two groups such that the sum of absolute deviations between each pixel and the mean of its group is minimized.
partition_pixels(img: numpy.ndarray[(rows,cols,3),uint8], num_thresholds: int) -> tuple
partition_pixels(img: numpy.ndarray[(rows,cols),uint8], num_thresholds: int) -> tuple
partition_pixels(img: numpy.ndarray[(rows,cols),uint16], num_thresholds: int) -> tuple
partition_pixels(img: numpy.ndarray[(rows,cols),uint32], num_thresholds: int) -> tuple
partition_pixels(img: numpy.ndarray[(rows,cols),float32], num_thresholds: int) -> tuple
partition_pixels(img: numpy.ndarray[(rows,cols),float64], num_thresholds: int) -> tuple
This version of partition_pixels() finds multiple partitions rather than just one partition. It does this by first partitioning the pixels just as the above partition_pixels(img) does. Then it forms a new image with only pixels >= that first partition value and recursively partitions this new image. However, the recursion is implemented in an efficient way which is faster than explicitly forming these images and calling partition_pixels(), but the output is the same as if you did. For example, suppose you called [t1,t2,t3] = partition_pixels(img,3). Then we would have:
t1 == partition_pixels(img)
t2 == partition_pixels(an image with only pixels with values >= t1 in it)
t3 == partition_pixels(an image with only pixels with values >= t2 in it)
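The criterion and the recursion described above can be sketched with a brute-force search over a flat list of pixel values. This is only an illustration under stated assumptions: the function names are hypothetical, the low group is taken to be the pixels strictly below the threshold, and dlib's own implementation is more efficient than this.

```python
def partition_threshold(pixels):
    """Brute-force threshold minimizing the total absolute deviation
    between each pixel and the mean of its group."""
    def cost(group):
        if not group:
            return 0.0
        mean = sum(group) / len(group)
        return sum(abs(p - mean) for p in group)

    best_t, best_cost = None, float("inf")
    for t in sorted(set(pixels)):
        low = [p for p in pixels if p < t]     # assumed: strictly below t
        high = [p for p in pixels if p >= t]
        total = cost(low) + cost(high)
        if total < best_cost:
            best_t, best_cost = t, total
    return best_t

def partition_thresholds(pixels, num_thresholds):
    """Repeat the partition on the pixels >= the previous threshold."""
    thresholds = []
    remaining = list(pixels)
    for _ in range(num_thresholds):
        t = partition_threshold(remaining)
        thresholds.append(t)
        remaining = [p for p in remaining if p >= t]
    return tuple(thresholds)
```

For a multi-channel image, the pixel values would first need to be converted to a single intensity per pixel; that step is left out of the sketch.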
- class dlib.point¶
This object represents a single point of integer coordinates that maps directly to a dlib::point.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.point, x: int, y: int) -> None
__init__(self: dlib.point, p: dlib::vector<double, 2l>) -> None
__init__(self: dlib.point, v: numpy.ndarray[int64]) -> None
__init__(self: dlib.point, v: numpy.ndarray[float32]) -> None
__init__(self: dlib.point, v: numpy.ndarray[float64]) -> None
- normalize(self: dlib.point) dlib::vector<double, 2l> ¶
Returns a unit normalized copy of this vector.
- property x¶
The x-coordinate of the point.
- property y¶
The y-coordinate of the point.
- class dlib.point_transform_projective¶
This is an object that takes 2D points and applies a projective transformation to them.
- __call__(self: dlib.point_transform_projective, p: dlib.dpoint) dlib.dpoint ¶
- ensures
Applies the projective transformation defined by this object’s constructor to p and returns the result. To define this precisely:
- let p_h == the point p in homogeneous coordinates. That is:
p_h.x == p.x
p_h.y == p.y
p_h.z == 1
let x == m*p_h
Then this function returns the value x/x.z
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.point_transform_projective) -> None
- ensures
This object will perform the identity transform. That is, given a point as input it will return the same point as output. Therefore, self.m == a 3x3 identity matrix.
__init__(self: dlib.point_transform_projective, m: numpy.ndarray[(rows,cols),float64]) -> None
- ensures
self.m == m
- property m¶
m is the 3x3 matrix that defines the projective transformation.
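The definition of __call__ above can be written out directly. The sketch below is a plain-Python illustration, with the hypothetical name apply_projective and with m given as a nested list rather than a numpy array; it is not the dlib class itself.

```python
def apply_projective(m, p):
    """Apply the 3x3 matrix m (nested lists) to the 2D point p = (x, y)."""
    p_h = (p[0], p[1], 1.0)                    # homogeneous coordinates
    x = [sum(m[r][c] * p_h[c] for c in range(3)) for r in range(3)]
    return (x[0] / x[2], x[1] / x[2])          # divide by the z component

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
shift = [[1.0, 0.0, 2.0], [0.0, 1.0, 3.0], [0.0, 0.0, 1.0]]
```

With the identity matrix the point is returned unchanged, matching the default constructor's behavior described above.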
- class dlib.points¶
An array of point objects.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.points) -> None
__init__(self: dlib.points, arg0: dlib.points) -> None
Copy constructor
__init__(self: dlib.points, arg0: iterable) -> None
__init__(self: dlib.points, initial_size: int) -> None
- append(self: dlib.points, x: dlib.point) None ¶
Add an item to the end of the list
- clear(self: dlib.points) None ¶
- count(self: dlib.points, x: dlib.point) int ¶
Return the number of times x appears in the list
- extend(*args, **kwargs)¶
Overloaded function.
extend(self: dlib.points, L: dlib.points) -> None
Extend the list by appending all the items in the given list
extend(self: dlib.points, arg0: list) -> None
- insert(self: dlib.points, i: int, x: dlib.point) None ¶
Insert an item at a given position.
- pop(*args, **kwargs)¶
Overloaded function.
pop(self: dlib.points) -> dlib.point
Remove and return the last item
pop(self: dlib.points, i: int) -> dlib.point
Remove and return the item at index i
- remove(self: dlib.points, x: dlib.point) None ¶
Remove the first item from the list whose value is x. It is an error if there is no such item.
- resize(self: dlib.points, arg0: int) None ¶
- dlib.polygon_area(*args, **kwargs)¶
Overloaded function.
polygon_area(pts: dlib.dpoints) -> float
polygon_area(pts: list) -> float
- ensures
If you walk the points pts in order to make a closed polygon, what is its area? This function returns that area. It uses the shoelace formula to compute the result and so works for general non-self-intersecting polygons.
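For reference, the shoelace formula mentioned above looks like this in plain Python. The function name polygon_area_sketch and the use of (x, y) tuples are assumptions for the example; returning the absolute value, so that the result does not depend on the winding direction, is also an assumption about the intended behavior.

```python
def polygon_area_sketch(pts):
    """Area of a closed, non-self-intersecting polygon via the shoelace formula."""
    n = len(pts)
    twice_area = 0.0
    for i in range(n):
        x0, y0 = pts[i]
        x1, y1 = pts[(i + 1) % n]              # wrap around to close the polygon
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0
```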
- dlib.probability_that_sequence_is_increasing(time_series: object) float ¶
returns the probability that the given sequence of real numbers is increasing in value over time.
- class dlib.pyramid_down¶
This is a simple object to help create image pyramids. In particular, it downsamples images at a ratio of N to N-1.
Note that setting N to 1 means that this object functions like pyramid_disable (defined at the bottom of this file).
WARNING, when mapping rectangles from one layer of a pyramid to another you might end up with rectangles which extend slightly outside your images. This is because points on the border of an image at a higher pyramid layer might correspond to points outside images at lower layers. So just keep this in mind. Note also that it’s easy to deal with. Just say something like this:
rect = rect.intersect(get_rect(my_image)); # keep rect inside my_image
- __call__(*args, **kwargs)¶
Overloaded function.
__call__(self: dlib.pyramid_down, img: numpy.ndarray[(rows,cols),uint8]) -> numpy.ndarray[(rows,cols),uint8]
__call__(self: dlib.pyramid_down, img: numpy.ndarray[(rows,cols),uint16]) -> numpy.ndarray[(rows,cols),uint16]
__call__(self: dlib.pyramid_down, img: numpy.ndarray[(rows,cols),uint32]) -> numpy.ndarray[(rows,cols),uint32]
__call__(self: dlib.pyramid_down, img: numpy.ndarray[(rows,cols),uint64]) -> numpy.ndarray[(rows,cols),uint64]
__call__(self: dlib.pyramid_down, img: numpy.ndarray[(rows,cols),int8]) -> numpy.ndarray[(rows,cols),int8]
__call__(self: dlib.pyramid_down, img: numpy.ndarray[(rows,cols),int16]) -> numpy.ndarray[(rows,cols),int16]
__call__(self: dlib.pyramid_down, img: numpy.ndarray[(rows,cols),int32]) -> numpy.ndarray[(rows,cols),int32]
__call__(self: dlib.pyramid_down, img: numpy.ndarray[(rows,cols),int64]) -> numpy.ndarray[(rows,cols),int64]
__call__(self: dlib.pyramid_down, img: numpy.ndarray[(rows,cols),float32]) -> numpy.ndarray[(rows,cols),float32]
__call__(self: dlib.pyramid_down, img: numpy.ndarray[(rows,cols),float64]) -> numpy.ndarray[(rows,cols),float64]
__call__(self: dlib.pyramid_down, img: numpy.ndarray[(rows,cols,3),uint8]) -> numpy.ndarray[(rows,cols,3),uint8]
Downsamples img to make a new image that is roughly (pyramid_downsampling_rate()-1)/pyramid_downsampling_rate() times the size of the original image.
The location of a point P in the original image will show up at point point_down(P) in the downsampled image.
Note that some points on the border of the original image might correspond to points outside the downsampled image.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.pyramid_down, N: int) -> None
Creates this class with the provided downsampling rate. i.e. pyramid_downsampling_rate()==N. N must be in the range 1 to 20.
__init__(self: dlib.pyramid_down) -> None
Creates this class with pyramid_downsampling_rate()==2
- point_down(*args, **kwargs)¶
Overloaded function.
point_down(self: dlib.pyramid_down, p: dlib.point) -> dlib.dpoint
point_down(self: dlib.pyramid_down, p: dlib.dpoint) -> dlib.dpoint
Maps from pixels in a source image to the corresponding pixels in the downsampled image.
point_down(self: dlib.pyramid_down, p: dlib.point, levels: int) -> dlib.dpoint
point_down(self: dlib.pyramid_down, p: dlib.dpoint, levels: int) -> dlib.dpoint
Applies point_down() to p levels times and returns the result.
- point_up(*args, **kwargs)¶
Overloaded function.
point_up(self: dlib.pyramid_down, p: dlib.point) -> dlib.dpoint
point_up(self: dlib.pyramid_down, p: dlib.dpoint) -> dlib.dpoint
Maps from pixels in a downsampled image to pixels in the original image.
point_up(self: dlib.pyramid_down, p: dlib.point, levels: int) -> dlib.dpoint
point_up(self: dlib.pyramid_down, p: dlib.dpoint, levels: int) -> dlib.dpoint
Applies point_up() to p levels times and returns the result.
- pyramid_downsampling_rate(self: dlib.pyramid_down) int ¶
Returns a number N that defines the downsampling rate. In particular, images are downsampled by a factor of N to N-1.
- rect_down(*args, **kwargs)¶
Overloaded function.
rect_down(self: dlib.pyramid_down, rect: dlib.rectangle) -> dlib.rectangle
rect_down(self: dlib.pyramid_down, rect: dlib.drectangle) -> dlib.drectangle
- returns drectangle(point_down(rect.tl_corner()), point_down(rect.br_corner()));
(i.e. maps rect into a downsampled image)
rect_down(self: dlib.pyramid_down, rect: dlib.rectangle, levels: int) -> dlib.rectangle
rect_down(self: dlib.pyramid_down, rect: dlib.drectangle, levels: int) -> dlib.drectangle
Applies rect_down() to rect levels times and returns the result.
- rect_up(*args, **kwargs)¶
Overloaded function.
rect_up(self: dlib.pyramid_down, rect: dlib.rectangle) -> dlib.rectangle
rect_up(self: dlib.pyramid_down, rect: dlib.drectangle) -> dlib.drectangle
- returns drectangle(point_up(rect.tl_corner()), point_up(rect.br_corner()));
(i.e. maps rect into a parent image)
rect_up(self: dlib.pyramid_down, rect: dlib.rectangle, levels: int) -> dlib.rectangle
rect_up(self: dlib.pyramid_down, p: dlib.drectangle, levels: int) -> dlib.drectangle
Applies rect_up() to rect levels times and returns the result.
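To make the N to N-1 ratio and the border warning concrete, here is a small pure-Python sketch. Both helper names are hypothetical, the integer rounding is only an approximation of the "roughly" in the description above, and the clipping assumes rectangles are (left, top, right, bottom) tuples with inclusive edges. It is meant for N >= 2, since N == 1 disables downsampling.

```python
def approx_pyramid_sizes(rows, cols, N, levels):
    """Approximate image sizes for successive pyramid levels, ratio (N-1)/N."""
    sizes = [(rows, cols)]
    for _ in range(levels):
        rows = rows * (N - 1) // N             # rounding here is an assumption
        cols = cols * (N - 1) // N
        sizes.append((rows, cols))
    return sizes

def clip_rect(rect, rows, cols):
    """Clip (left, top, right, bottom) so it lies inside a rows x cols image."""
    left, top, right, bottom = rect
    return (max(left, 0), max(top, 0),
            min(right, cols - 1), min(bottom, rows - 1))
```

The clipping step plays the same role as the rect.intersect(get_rect(my_image)) idiom shown in the class description.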
- dlib.randomly_color_image(*args, **kwargs)¶
Overloaded function.
randomly_color_image(img: numpy.ndarray[(rows,cols),uint8]) -> numpy.ndarray[(rows,cols,3),uint8]
randomly_color_image(img: numpy.ndarray[(rows,cols),uint16]) -> numpy.ndarray[(rows,cols,3),uint8]
randomly_color_image(img: numpy.ndarray[(rows,cols),uint32]) -> numpy.ndarray[(rows,cols,3),uint8]
Randomly generates a mapping from gray level pixel values to the RGB pixel space and then uses this mapping to create a colored version of img. Returns an image which represents this colored version of img.
Black pixels in img will remain black in the output image.
- class dlib.range¶
This object is used to represent a range of elements in an array.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.range, arg0: int, arg1: int) -> None
__init__(self: dlib.range, arg0: int) -> None
- property begin¶
The index of the first element in the range. This is represented using an unsigned integer.
- property end¶
One past the index of the last element in the range. This is represented using an unsigned integer.
- class dlib.ranges¶
This object is an array of range objects.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.ranges) -> None
__init__(self: dlib.ranges, arg0: dlib.ranges) -> None
Copy constructor
__init__(self: dlib.ranges, arg0: iterable) -> None
- append(self: dlib.ranges, x: dlib.range) None ¶
Add an item to the end of the list
- clear(self: dlib.ranges) None ¶
- count(self: dlib.ranges, x: dlib.range) int ¶
Return the number of times x appears in the list
- extend(*args, **kwargs)¶
Overloaded function.
extend(self: dlib.ranges, L: dlib.ranges) -> None
Extend the list by appending all the items in the given list
extend(self: dlib.ranges, arg0: list) -> None
- insert(self: dlib.ranges, i: int, x: dlib.range) None ¶
Insert an item at a given position.
- pop(*args, **kwargs)¶
Overloaded function.
pop(self: dlib.ranges) -> dlib.range
Remove and return the last item
pop(self: dlib.ranges, i: int) -> dlib.range
Remove and return the item at index i
- remove(self: dlib.ranges, x: dlib.range) None ¶
Remove the first item from the list whose value is x. It is an error if there is no such item.
- resize(self: dlib.ranges, arg0: int) None ¶
- class dlib.rangess¶
This object is an array of arrays of range objects.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.rangess) -> None
__init__(self: dlib.rangess, arg0: dlib.rangess) -> None
Copy constructor
__init__(self: dlib.rangess, arg0: iterable) -> None
- append(self: dlib.rangess, x: dlib.ranges) None ¶
Add an item to the end of the list
- clear(self: dlib.rangess) None ¶
- count(self: dlib.rangess, x: dlib.ranges) int ¶
Return the number of times x appears in the list
- extend(*args, **kwargs)¶
Overloaded function.
extend(self: dlib.rangess, L: dlib.rangess) -> None
Extend the list by appending all the items in the given list
extend(self: dlib.rangess, arg0: list) -> None
- insert(self: dlib.rangess, i: int, x: dlib.ranges) None ¶
Insert an item at a given position.
- pop(*args, **kwargs)¶
Overloaded function.
pop(self: dlib.rangess) -> dlib.ranges
Remove and return the last item
pop(self: dlib.rangess, i: int) -> dlib.ranges
Remove and return the item at index i
- remove(self: dlib.rangess, x: dlib.ranges) None ¶
Remove the first item from the list whose value is x. It is an error if there is no such item.
- resize(self: dlib.rangess, arg0: int) None ¶
- class dlib.ranking_pair¶
- __init__(self: dlib.ranking_pair) None ¶
- property nonrelevant¶
- property relevant¶
- class dlib.ranking_pairs¶
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.ranking_pairs) -> None
__init__(self: dlib.ranking_pairs, arg0: dlib.ranking_pairs) -> None
Copy constructor
__init__(self: dlib.ranking_pairs, arg0: iterable) -> None
- append(self: dlib.ranking_pairs, x: dlib.ranking_pair) None ¶
Add an item to the end of the list
- clear(self: dlib.ranking_pairs) None ¶
- count(self: dlib.ranking_pairs, x: dlib.ranking_pair) int ¶
Return the number of times x appears in the list
- extend(*args, **kwargs)¶
Overloaded function.
extend(self: dlib.ranking_pairs, L: dlib.ranking_pairs) -> None
Extend the list by appending all the items in the given list
extend(self: dlib.ranking_pairs, arg0: list) -> None
- insert(self: dlib.ranking_pairs, i: int, x: dlib.ranking_pair) None ¶
Insert an item at a given position.
- pop(*args, **kwargs)¶
Overloaded function.
pop(self: dlib.ranking_pairs) -> dlib.ranking_pair
Remove and return the last item
pop(self: dlib.ranking_pairs, i: int) -> dlib.ranking_pair
Remove and return the item at index i
- remove(self: dlib.ranking_pairs, x: dlib.ranking_pair) None ¶
Remove the first item from the list whose value is x. It is an error if there is no such item.
- resize(self: dlib.ranking_pairs, arg0: int) None ¶
- class dlib.rect_filter¶
This object is a simple tool for filtering a rectangle that measures the location of a moving object that has some non-trivial momentum. Importantly, the measurements are noisy and the object can experience sudden unpredictable accelerations. To accomplish this filtering we use a simple Kalman filter with a state transition model of:
position_{i+1} = position_{i} + velocity_{i}
velocity_{i+1} = velocity_{i} + some_unpredictable_acceleration
and a measurement model of:
measured_position_{i} = position_{i} + measurement_noise
Where some_unpredictable_acceleration and measurement_noise are 0 mean Gaussian noise sources with standard deviations of typical_acceleration and measurement_noise respectively.
To allow for really sudden and large but infrequent accelerations, at each step we check if the current measured position deviates from the predicted filtered position by more than max_measurement_deviation*measurement_noise and if so we adjust the filter’s state to keep it within these bounds. This allows the moving object to undergo large unmodeled accelerations, far in excess of what would be suggested by typical_acceleration, without then experiencing a long lag time where the Kalman filter has to “catch up” to the new position.
- __call__(self: dlib.rect_filter, rect: dlib.rectangle) dlib.rectangle ¶
- __init__(self: dlib.rect_filter, measurement_noise: float, typical_acceleration: float, max_measurement_deviation: float) None ¶
- max_measurement_deviation(self: dlib.rect_filter) float ¶
- measurement_noise(self: dlib.rect_filter) float ¶
- typical_acceleration(self: dlib.rect_filter) float ¶
- class dlib.rectangle¶
This object represents a rectangular area of an image.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.rectangle, left: int, top: int, right: int, bottom: int) -> None
__init__(self: dlib.rectangle, rect: dlib::drectangle) -> None
__init__(self: dlib.rectangle, rect: dlib.rectangle) -> None
__init__(self: dlib.rectangle) -> None
- area(self: dlib.rectangle) int ¶
- bl_corner(self: dlib.rectangle) dlib.point ¶
Returns the bottom left corner of the rectangle.
- bottom(self: dlib.rectangle) int ¶
- br_corner(self: dlib.rectangle) dlib.point ¶
Returns the bottom right corner of the rectangle.
- center(self: dlib.rectangle) dlib.point ¶
- contains(*args, **kwargs)¶
Overloaded function.
contains(self: dlib.rectangle, point: dlib.point) -> bool
contains(self: dlib.rectangle, point: dlib.dpoint) -> bool
contains(self: dlib.rectangle, x: int, y: int) -> bool
contains(self: dlib.rectangle, rectangle: dlib.rectangle) -> bool
- dcenter(self: dlib.rectangle) dlib.point ¶
- height(self: dlib.rectangle) int ¶
- intersect(self: dlib.rectangle, rectangle: dlib.rectangle) dlib.rectangle ¶
- is_empty(self: dlib.rectangle) bool ¶
- left(self: dlib.rectangle) int ¶
- right(self: dlib.rectangle) int ¶
- tl_corner(self: dlib.rectangle) dlib.point ¶
Returns the top left corner of the rectangle.
- top(self: dlib.rectangle) int ¶
- tr_corner(self: dlib.rectangle) dlib.point ¶
Returns the top right corner of the rectangle.
- width(self: dlib.rectangle) int ¶
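The accessors listed for this class can be pictured with a small stand-in written in plain Python. The listing above does not spell out the edge convention, so this sketch assumes the convention used by dlib's C++ rectangle, where right and bottom are inclusive and a default-constructed rectangle is empty. The class name RectSketch is hypothetical and the real dlib.rectangle should be checked before relying on these details.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RectSketch:
    # Assumed convention: right and bottom are inclusive; the default is empty.
    left: int = 0
    top: int = 0
    right: int = -1
    bottom: int = -1

    def is_empty(self):
        return self.left > self.right or self.top > self.bottom

    def width(self):
        return 0 if self.is_empty() else self.right - self.left + 1

    def height(self):
        return 0 if self.is_empty() else self.bottom - self.top + 1

    def area(self):
        return self.width() * self.height()

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom

    def intersect(self, other):
        # Overlap of the two rectangles; empty if they do not overlap.
        return RectSketch(max(self.left, other.left), max(self.top, other.top),
                          min(self.right, other.right), min(self.bottom, other.bottom))
```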
- class dlib.rectangles¶
An array of rectangle objects.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.rectangles) -> None
__init__(self: dlib.rectangles, arg0: dlib.rectangles) -> None
Copy constructor
__init__(self: dlib.rectangles, arg0: iterable) -> None
__init__(self: dlib.rectangles, initial_size: int) -> None
- append(self: dlib.rectangles, x: dlib.rectangle) None ¶
Add an item to the end of the list
- clear(self: dlib.rectangles) None ¶
- count(self: dlib.rectangles, x: dlib.rectangle) int ¶
Return the number of times x appears in the list
- extend(*args, **kwargs)¶
Overloaded function.
extend(self: dlib.rectangles, L: dlib.rectangles) -> None
Extend the list by appending all the items in the given list
extend(self: dlib.rectangles, arg0: list) -> None
- insert(self: dlib.rectangles, i: int, x: dlib.rectangle) None ¶
Insert an item at a given position.
- pop(*args, **kwargs)¶
Overloaded function.
pop(self: dlib.rectangles) -> dlib.rectangle
Remove and return the last item
pop(self: dlib.rectangles, i: int) -> dlib.rectangle
Remove and return the item at index i
- remove(self: dlib.rectangles, x: dlib.rectangle) None ¶
Remove the first item from the list whose value is x. It is an error if there is no such item.
- resize(self: dlib.rectangles, arg0: int) None ¶
- class dlib.rectangless¶
An array of arrays of rectangle objects.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.rectangless) -> None
__init__(self: dlib.rectangless, arg0: dlib.rectangless) -> None
Copy constructor
__init__(self: dlib.rectangless, arg0: iterable) -> None
__init__(self: dlib.rectangless, initial_size: int) -> None
- append(self: dlib.rectangless, x: dlib.rectangles) None ¶
Add an item to the end of the list
- clear(self: dlib.rectangless) None ¶
- count(self: dlib.rectangless, x: dlib.rectangles) int ¶
Return the number of times x appears in the list
- extend(*args, **kwargs)¶
Overloaded function.
extend(self: dlib.rectangless, L: dlib.rectangless) -> None
Extend the list by appending all the items in the given list
extend(self: dlib.rectangless, arg0: list) -> None
- insert(self: dlib.rectangless, i: int, x: dlib.rectangles) None ¶
Insert an item at a given position.
- pop(*args, **kwargs)¶
Overloaded function.
pop(self: dlib.rectangless) -> dlib.rectangles
Remove and return the last item
pop(self: dlib.rectangless, i: int) -> dlib.rectangles
Remove and return the item at index i
- remove(self: dlib.rectangless, x: dlib.rectangles) None ¶
Remove the first item from the list whose value is x. It is an error if there is no such item.
- resize(self: dlib.rectangless, arg0: int) None ¶
- dlib.reduce(*args, **kwargs)¶
Overloaded function.
reduce(df: dlib._normalized_decision_function_radial_basis, x: dlib.vectors, num_basis_vectors: int, eps: float=0.001) -> dlib._normalized_decision_function_radial_basis
reduce(df: dlib._normalized_decision_function_radial_basis, x: numpy.ndarray[(rows,cols),float64], num_basis_vectors: int, eps: float=0.001) -> dlib._normalized_decision_function_radial_basis
- requires
eps > 0
num_basis_vectors > 0
- ensures
This routine takes a learned radial basis function and tries to find a new RBF function with num_basis_vectors basis vectors that approximates the given df() as closely as possible. In particular, it finds a function new_df() such that new_df(x[i])==df(x[i]) as often as possible.
This is accomplished using a reduced set method that begins by using a projection, in kernel space, onto a random set of num_basis_vectors vectors in x. Then, L-BFGS is used to further optimize new_df() to match df(). The eps parameter controls how long L-BFGS will run, smaller values of eps possibly giving better solutions but taking longer to execute.
- dlib.remove_incoherent_edge_pixels(line: dlib.points, horz_gradient: numpy.ndarray[rows, cols, float32], vert_gradient: numpy.ndarray[rows, cols, float32], angle_thresh: float) dlib.points ¶
- requires
horz_gradient and vert_gradient have the same dimensions.
horz_gradient and vert_gradient represent unit normalized vectors. That is, you should have called normalize_image_gradients(horz_gradient,vert_gradient) or otherwise caused all the gradients to have unit norm.
- for all valid i:
get_rect(horz_gradient).contains(line[i])
- ensures
This routine looks at all the points in the given line and discards the ones that have outlying gradient directions. To be specific, this routine returns a set of points PTS such that:
- for all valid i,j:
The difference in angle between the gradients for PTS[i] and PTS[j] is less than angle_thresh degrees.
len(PTS) <= len(line)
PTS is just line with some elements removed.
- dlib.resize_image(*args, **kwargs)¶
Overloaded function.
resize_image(img: numpy.ndarray[(rows,cols),uint8], rows: int, cols: int) -> numpy.ndarray[(rows,cols),uint8]
resize_image(img: numpy.ndarray[(rows,cols),uint16], rows: int, cols: int) -> numpy.ndarray[(rows,cols),uint16]
resize_image(img: numpy.ndarray[(rows,cols),uint32], rows: int, cols: int) -> numpy.ndarray[(rows,cols),uint32]
resize_image(img: numpy.ndarray[(rows,cols),uint64], rows: int, cols: int) -> numpy.ndarray[(rows,cols),uint64]
resize_image(img: numpy.ndarray[(rows,cols),int8], rows: int, cols: int) -> numpy.ndarray[(rows,cols),int8]
resize_image(img: numpy.ndarray[(rows,cols),int16], rows: int, cols: int) -> numpy.ndarray[(rows,cols),int16]
resize_image(img: numpy.ndarray[(rows,cols),int32], rows: int, cols: int) -> numpy.ndarray[(rows,cols),int32]
resize_image(img: numpy.ndarray[(rows,cols),int64], rows: int, cols: int) -> numpy.ndarray[(rows,cols),int64]
resize_image(img: numpy.ndarray[(rows,cols),float32], rows: int, cols: int) -> numpy.ndarray[(rows,cols),float32]
resize_image(img: numpy.ndarray[(rows,cols),float64], rows: int, cols: int) -> numpy.ndarray[(rows,cols),float64]
Resizes img, using bilinear interpolation, to have the indicated number of rows and columns.
resize_image(img: numpy.ndarray[(rows,cols,3),uint8], rows: int, cols: int) -> numpy.ndarray[(rows,cols,3),uint8]
Resizes img, using bilinear interpolation, to have the indicated number of rows and columns.
resize_image(img: numpy.ndarray[(rows,cols),int8], scale: float) -> numpy.ndarray[(rows,cols),int8]
resize_image(img: numpy.ndarray[(rows,cols),int16], scale: float) -> numpy.ndarray[(rows,cols),int16]
resize_image(img: numpy.ndarray[(rows,cols),int32], scale: float) -> numpy.ndarray[(rows,cols),int32]
resize_image(img: numpy.ndarray[(rows,cols),int64], scale: float) -> numpy.ndarray[(rows,cols),int64]
resize_image(img: numpy.ndarray[(rows,cols),float32], scale: float) -> numpy.ndarray[(rows,cols),float32]
resize_image(img: numpy.ndarray[(rows,cols),float64], scale: float) -> numpy.ndarray[(rows,cols),float64]
resize_image(img: numpy.ndarray[(rows,cols,3),uint8], scale: float) -> numpy.ndarray[(rows,cols,3),uint8]
Resizes img, using bilinear interpolation, to have the new size (img rows * scale, img cols * scale)
- dlib.reverse(l: dlib.line) dlib.line ¶
- ensures
returns line(l.p2, l.p1) (i.e. returns a line object that represents the same line as l but with the endpoints, and therefore, the normal vector flipped. This means that the signed distance of operator() is also flipped).
- class dlib.rgb_pixel¶
- __init__(self: dlib.rgb_pixel, red: int, green: int, blue: int) None ¶
- property blue¶
- property green¶
- property red¶
- class dlib.rvm_trainer_histogram_intersection¶
- __init__(self: dlib.rvm_trainer_histogram_intersection) None ¶
- property epsilon¶
- train(self: dlib.rvm_trainer_histogram_intersection, arg0: dlib.vectors, arg1: dlib.array) dlib._decision_function_histogram_intersection ¶
- class dlib.rvm_trainer_linear¶
- __init__(self: dlib.rvm_trainer_linear) None ¶
- property epsilon¶
- train(self: dlib.rvm_trainer_linear, arg0: dlib.vectors, arg1: dlib.array) dlib._decision_function_linear ¶
- class dlib.rvm_trainer_radial_basis¶
- __init__(self: dlib.rvm_trainer_radial_basis) None ¶
- property epsilon¶
- property gamma¶
- train(self: dlib.rvm_trainer_radial_basis, arg0: dlib.vectors, arg1: dlib.array) dlib._decision_function_radial_basis ¶
- class dlib.rvm_trainer_sparse_histogram_intersection¶
- __init__(self: dlib.rvm_trainer_sparse_histogram_intersection) None ¶
- property epsilon¶
- train(self: dlib.rvm_trainer_sparse_histogram_intersection, arg0: dlib.sparse_vectors, arg1: dlib.array) dlib._decision_function_sparse_histogram_intersection ¶
- class dlib.rvm_trainer_sparse_linear¶
- __init__(self: dlib.rvm_trainer_sparse_linear) None ¶
- property epsilon¶
- train(self: dlib.rvm_trainer_sparse_linear, arg0: dlib.sparse_vectors, arg1: dlib.array) dlib._decision_function_sparse_linear ¶
- class dlib.rvm_trainer_sparse_radial_basis¶
- __init__(self: dlib.rvm_trainer_sparse_radial_basis) None ¶
- property epsilon¶
- property gamma¶
- train(self: dlib.rvm_trainer_sparse_radial_basis, arg0: dlib.sparse_vectors, arg1: dlib.array) dlib._decision_function_sparse_radial_basis ¶
- dlib.save_face_chip(img: numpy.ndarray[rows, cols, 3, uint8], face: dlib.full_object_detection, chip_filename: str, size: int = 150, padding: float = 0.25) None ¶
Takes an image and a full_object_detection that references a face in that image and saves the face with the specified file name prefix. The face will be rotated upright and scaled to 150x150 pixels by default, or to the optionally specified size and padding.
- dlib.save_face_chips(img: numpy.ndarray[rows, cols, 3, uint8], faces: dlib.full_object_detections, chip_filename: str, size: int = 150, padding: float = 0.25) None ¶
Takes an image and a full_object_detections object that references faces in that image and saves the faces with the specified file name prefix. The faces will be rotated upright and scaled to 150x150 pixels by default, or to the optionally specified size and padding.
- dlib.save_image(*args, **kwargs)¶
Overloaded function.
save_image(img: numpy.ndarray[(rows,cols,3),uint8], filename: str) -> None
Saves the given image to the specified path. The file type is determined from the file extension specified in the path.
save_image(img: numpy.ndarray[(rows,cols),uint8], filename: str) -> None
Saves the given image to the specified path. The file type is determined from the file extension specified in the path.
- dlib.save_libsvm_formatted_data(file_name: str, samples: dlib.sparse_vectors, labels: dlib.array) None ¶
- requires
len(samples) == len(labels)
- ensures
saves the data to the given file in libsvm format
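As a rough illustration of what "libsvm format" means, the sketch below formats samples as one line per sample, `<label> <index>:<value> ...`. It is pure Python and does not call dlib. The helper names and the number formatting are assumptions made for this example, and the exact text dlib writes may differ.

```python
def format_libsvm_line(label, sparse_vector):
    """Format one sample as '<label> <index>:<value> ...'."""
    pairs = " ".join(f"{index}:{value:g}" for index, value in sparse_vector)
    return f"{label:g} {pairs}".rstrip()


def format_libsvm(samples, labels):
    # Mirrors the documented precondition len(samples) == len(labels).
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    return "\n".join(format_libsvm_line(y, x) for x, y in zip(samples, labels))


# Each sample is a list of (index, value) pairs, like a sparse_vector.
samples = [[(0, 1.5), (3, -2.0)], [(1, 4.0)]]
labels = [+1.0, -1.0]
text = format_libsvm(samples, labels)
print(text)
```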
- dlib.scale_rect(rect: dlib.rectangle, scale: float) dlib.rectangle ¶
return scale_rect(rect, scale)
(i.e. resizes the given rectangle by a scale factor)
- class dlib.segmenter_params¶
This class is used to define all the optional parameters to the train_sequence_segmenter() and cross_validate_sequence_segmenter() routines.
- property C¶
SVM C parameter
- __init__(self: dlib.segmenter_params) None ¶
- property allow_negative_weights¶
- property be_verbose¶
- property epsilon¶
- property max_cache_size¶
- property num_threads¶
- property use_BIO_model¶
- property use_high_order_features¶
- property window_size¶
- class dlib.segmenter_test¶
This object is the output of the dlib.test_sequence_segmenter() and dlib.cross_validate_sequence_segmenter() routines.
- __init__(*args, **kwargs)¶
- property f1¶
- property precision¶
- property recall¶
- class dlib.segmenter_type¶
This object represents a sequence segmenter and is the type of object returned by the dlib.train_sequence_segmenter() routine.
- __call__(*args, **kwargs)¶
Overloaded function.
__call__(self: dlib.segmenter_type, arg0: dlib.vectors) -> dlib.ranges
__call__(self: dlib.segmenter_type, arg0: dlib.sparse_vectors) -> dlib.ranges
- __init__(*args, **kwargs)¶
- property weights¶
- dlib.set_dnn_prefer_smallest_algorithms() None ¶
Tells cuDNN to use slower algorithms that use less RAM.
- class dlib.shape_predictor¶
This object is a tool that takes in an image region containing some object and outputs a set of point locations that define the pose of the object. The classic example of this is human face pose prediction, where you take an image of a human face as input and are expected to identify the locations of important facial landmarks such as the corners of the mouth and eyes, tip of the nose, and so forth.
- __call__(self: dlib.shape_predictor, image: array, box: dlib.rectangle) dlib.full_object_detection ¶
- requires
image is a numpy ndarray containing either an 8bit grayscale or RGB image.
box is the bounding box to begin the shape prediction inside.
- ensures
This function runs the shape predictor on the input image and returns a single full_object_detection.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.shape_predictor) -> None
__init__(self: dlib.shape_predictor, arg0: str) -> None
Loads a shape_predictor from a file that contains the output of the train_shape_predictor() routine.
- save(self: dlib.shape_predictor, predictor_output_filename: str) None ¶
Save a shape_predictor to the provided path.
- class dlib.shape_predictor_training_options¶
This object is a container for the options to the train_shape_predictor() routine.
- __init__(self: dlib.shape_predictor_training_options) None ¶
- property be_verbose¶
If true, train_shape_predictor() will print out a lot of information to stdout while training.
- property cascade_depth¶
The number of cascades created to train the model with.
- property feature_pool_region_padding¶
Size of region within which to sample features for the feature pool. Positive values increase the sampling region while negative values decrease it. E.g. padding of 0 means we sample fr
- property feature_pool_size¶
Number of pixels used to generate features for the random trees.
- property lambda_param¶
Controls how tight the feature sampling should be. Lower values enforce closer features.
- property landmark_relative_padding_mode¶
If True then features are drawn only from the box around the landmarks, otherwise they come from the bounding box and landmarks together. See feature_pool_region_padding doc for more details.
- property nu¶
The regularization parameter. Larger values of this parameter will cause the algorithm to fit the training data better but may also cause overfitting. The value must be in the range (0, 1].
- property num_test_splits¶
Number of split features at each node to sample. The one that gives the best split is chosen.
- property num_threads¶
Use this many threads/CPU cores for training.
- property num_trees_per_cascade_level¶
The number of trees created for each cascade.
- property oversampling_amount¶
The number of randomly selected initial starting points sampled for each training example
- property oversampling_translation_jitter¶
The amount of translation jittering to apply to bounding boxes. A good value is in the range [0 0.5].
- property random_seed¶
The random seed used by the internal random number generator
- property tree_depth¶
The depth of the trees used in each cascade. There are pow(2, get_tree_depth()) leaves in each tree
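To get a feel for how these options determine model size, the following sketch works through the arithmetic. The leaf count per tree follows the pow(2, tree_depth) statement above. Treating the total number of trees as cascade_depth times num_trees_per_cascade_level is an assumption of this example, and the option values are illustrative only.

```python
def leaves_per_tree(tree_depth):
    # From the tree_depth description: pow(2, tree_depth) leaves per tree.
    return 2 ** tree_depth


def total_trees(cascade_depth, num_trees_per_cascade_level):
    # Assumption: every cascade level holds the same number of trees.
    return cascade_depth * num_trees_per_cascade_level


# Illustrative option values, not necessarily the library defaults.
cascade_depth = 10
num_trees_per_cascade_level = 500
tree_depth = 4

trees = total_trees(cascade_depth, num_trees_per_cascade_level)
leaves = trees * leaves_per_tree(tree_depth)
print(trees, "trees with", leaves, "leaves in total")
```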
- dlib.shrink_rect(rect: dlib.rectangle, num: int) dlib.rectangle ¶
- returns rectangle(rect.left()+num, rect.top()+num, rect.right()-num, rect.bottom()-num)
(i.e. shrinks the given rectangle by shrinking its border by num)
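The formula above can be checked with a small pure-Python sketch. The `Rect` named tuple is a hypothetical stand-in for dlib.rectangle used only for this illustration; it does not call dlib.

```python
from collections import namedtuple

# Hypothetical stand-in for dlib.rectangle (left, top, right, bottom).
Rect = namedtuple("Rect", "left top right bottom")


def shrink_rect(rect, num):
    # Move every edge inward by num, as in the documented formula.
    return Rect(rect.left + num, rect.top + num,
                rect.right - num, rect.bottom - num)


r = Rect(10, 20, 110, 220)
s = shrink_rect(r, 5)
print(s)
# Each side moves inward, so the extent shrinks by 2*num in both directions.
print((r.right - r.left) - (s.right - s.left))
```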
- dlib.signed_distance_to_line(*args, **kwargs)¶
Overloaded function.
signed_distance_to_line(l: dlib.line, p: dlib.point) -> float
signed_distance_to_line(l: dlib.line, p: dlib.dpoint) -> float
- ensures
returns how far p is from the line l. This is a signed distance. The sign indicates which side of the line the point is on and the magnitude is the distance. Moreover, the direction of positive sign is pointed to by the vector l.normal.
To be specific, this routine returns dot(p-l.p1, l.normal)
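The following pure-Python sketch illustrates the dot(p-l.p1, l.normal) formula and why reversing a line flips the sign of the distance. Points are plain (x, y) tuples rather than dlib objects. Which of the two perpendicular directions dlib uses for l.normal is not stated here, so the orientation chosen in `unit_normal` is an assumption.

```python
import math


def unit_normal(p1, p2):
    """Unit vector perpendicular to the line through p1 and p2.

    The choice between the two perpendicular directions is an assumption.
    """
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    length = math.hypot(dx, dy)
    return (dy / length, -dx / length)


def signed_distance_to_line(p1, p2, p):
    # dot(p - p1, normal)
    nx, ny = unit_normal(p1, p2)
    return (p[0] - p1[0]) * nx + (p[1] - p1[1]) * ny


p1, p2, p = (0.0, 0.0), (10.0, 0.0), (3.0, 4.0)
d = signed_distance_to_line(p1, p2, p)
# Swapping the endpoints (as dlib.reverse does) flips the normal and the sign.
d_reversed = signed_distance_to_line(p2, p1, p)
print(d, d_reversed)
```

Whatever orientation is used, the magnitude is the perpendicular distance (4 in this example) and swapping the endpoints negates the result.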
- class dlib.simple_object_detector¶
This object represents a sliding window histogram-of-oriented-gradients based object detector.
- __call__(*args, **kwargs)¶
Overloaded function.
__call__(self: dlib.simple_object_detector, image: array, upsample_num_times: int) -> dlib.rectangles
- requires
image is a numpy ndarray containing either an 8bit grayscale or RGB image.
upsample_num_times >= 0
- ensures
This function runs the object detector on the input image and returns a list of detections.
Upsamples the image upsample_num_times before running the basic detector. If you don’t know how many times you want to upsample then don’t provide a value for upsample_num_times and an appropriate default will be used.
__call__(self: dlib.simple_object_detector, image: array) -> dlib.rectangles
- requires
image is a numpy ndarray containing either an 8bit grayscale or RGB image.
- ensures
This function runs the object detector on the input image and returns a list of detections.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.simple_object_detector, detectors: list) -> None
This version of the constructor builds a simple_object_detector from a list of other simple_object_detectors. It essentially packs them together so that running the detector is like calling run_multiple(), except that the non-max suppression is applied to them all as a group. So unlike run_multiple(), each detector competes in the non-max suppression.
Also, the non-max suppression settings used for this whole thing are the settings used by detectors[0]. So if you have a preference, put the detector that uses the type of non-max suppression you like first in the list.
__init__(self: dlib.simple_object_detector, arg0: str) -> None
Loads a simple_object_detector from a file that contains the output of the train_simple_object_detector() routine.
- property detection_window_height¶
- property detection_window_width¶
- property num_detectors¶
- run_multiple(detectors: list, image: array, upsample_num_times: int = 0, adjust_threshold: float = 0.0) tuple ¶
- requires
detectors is a list of detectors.
image is a numpy ndarray containing either an 8bit grayscale or RGB image.
upsample_num_times >= 0
- ensures
This function runs the list of object detectors at once on the input image and returns a tuple of (list of detections, list of scores, list of weight_indices).
Upsamples the image upsample_num_times before running the basic detector.
- save(self: dlib.simple_object_detector, detector_output_filename: str) None ¶
Save a simple_object_detector to the provided path.
- property upsampling_amount¶
The detector upsamples the image this many times before running.
- class dlib.simple_object_detector_training_options¶
This object is a container for the options to the train_simple_object_detector() routine.
- property C¶
C is the usual SVM C regularization parameter. So it is passed to structural_object_detection_trainer::set_c(). Larger values of C will encourage the trainer to fit the data better but might lead to overfitting. Therefore, you must determine the proper setting of this parameter experimentally.
- __init__(self: dlib.simple_object_detector_training_options) None ¶
- property add_left_right_image_flips¶
If true, train_simple_object_detector() will assume the objects are left/right symmetric and add left/right flips of the training images. This doubles the size of the training dataset.
- property be_verbose¶
If true, train_simple_object_detector() will print out a lot of information to the screen while training.
- property detection_window_size¶
The sliding window used will have about this many pixels inside it.
- property epsilon¶
epsilon is the stopping epsilon. Smaller values make the trainer’s solver more accurate but might take longer to train.
- property max_runtime_seconds¶
Don’t let the solver run for longer than this many seconds.
- property nuclear_norm_regularization_strength¶
This detector works by convolving a filter over a HOG feature image. If that filter is separable then the convolution can be performed much faster. The nuclear_norm_regularization_strength parameter encourages the machine learning algorithm to learn a separable filter. A value of 0 disables this feature, but any non-zero value places a nuclear norm regularizer on the objective function and this encourages the learning of a separable filter. Note that setting nuclear_norm_regularization_strength to a non-zero value can make the training process take significantly longer, so be patient when using it.
- property num_threads¶
train_simple_object_detector() will use this many threads of execution. Set this to the number of CPU cores on your machine to obtain the fastest training speed.
- property upsample_limit¶
train_simple_object_detector() will upsample images if needed, but no more than upsample_limit times. A value of 0 forbids the trainer from upsampling any images. If the trainer is unable to fit all boxes within the required upsample_limit, an exception will be thrown. Higher values of upsample_limit exponentially increase memory requirements. Values higher than 2 (the default) are not recommended.
- class dlib.simple_test_results¶
- __init__(*args, **kwargs)¶
- property average_precision¶
- property precision¶
- property recall¶
- dlib.skeleton(img: numpy.ndarray[rows, cols, uint8]) numpy.ndarray[rows, cols, uint8] ¶
- requires
all pixels in img are set to either 255 or 0.
- ensures
This function computes the skeletonization of img and stores the result in #img. That is, given a binary image, we progressively thin the binary blobs (composed of on_pixel values) until only a single pixel wide skeleton of the original blobs remains.
Doesn’t change the shape or size of img.
- dlib.sobel_edge_detector(*args, **kwargs)¶
Overloaded function.
sobel_edge_detector(img: numpy.ndarray[(rows,cols),uint8]) -> tuple
sobel_edge_detector(img: numpy.ndarray[(rows,cols),uint16]) -> tuple
sobel_edge_detector(img: numpy.ndarray[(rows,cols),uint32]) -> tuple
sobel_edge_detector(img: numpy.ndarray[(rows,cols),uint64]) -> tuple
sobel_edge_detector(img: numpy.ndarray[(rows,cols),int8]) -> tuple
sobel_edge_detector(img: numpy.ndarray[(rows,cols),int16]) -> tuple
sobel_edge_detector(img: numpy.ndarray[(rows,cols),int32]) -> tuple
sobel_edge_detector(img: numpy.ndarray[(rows,cols),int64]) -> tuple
sobel_edge_detector(img: numpy.ndarray[(rows,cols),float32]) -> tuple
sobel_edge_detector(img: numpy.ndarray[(rows,cols),float64]) -> tuple
Applies the sobel edge detector to the given input image and returns two gradient images in a tuple. The first contains the x gradients and the second contains the y gradients of the image.
- dlib.solve_structural_svm_problem(problem: object) dlib.vector ¶
This function solves a structural SVM problem and returns the weight vector that defines the solution. See the example program python_examples/svm_struct.py for documentation about how to create a proper problem object.
- class dlib.sparse_ranking_pair¶
- __init__(self: dlib.sparse_ranking_pair) None ¶
- property nonrelevant¶
- property relevant¶
- class dlib.sparse_ranking_pairs¶
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.sparse_ranking_pairs) -> None
__init__(self: dlib.sparse_ranking_pairs, arg0: dlib.sparse_ranking_pairs) -> None
Copy constructor
__init__(self: dlib.sparse_ranking_pairs, arg0: iterable) -> None
- append(self: dlib.sparse_ranking_pairs, x: dlib.sparse_ranking_pair) None ¶
Add an item to the end of the list
- clear(self: dlib.sparse_ranking_pairs) None ¶
- count(self: dlib.sparse_ranking_pairs, x: dlib.sparse_ranking_pair) int ¶
Return the number of times x appears in the list
- extend(*args, **kwargs)¶
Overloaded function.
extend(self: dlib.sparse_ranking_pairs, L: dlib.sparse_ranking_pairs) -> None
Extend the list by appending all the items in the given list
extend(self: dlib.sparse_ranking_pairs, arg0: list) -> None
- insert(self: dlib.sparse_ranking_pairs, i: int, x: dlib.sparse_ranking_pair) None ¶
Insert an item at a given position.
- pop(*args, **kwargs)¶
Overloaded function.
pop(self: dlib.sparse_ranking_pairs) -> dlib.sparse_ranking_pair
Remove and return the last item
pop(self: dlib.sparse_ranking_pairs, i: int) -> dlib.sparse_ranking_pair
Remove and return the item at index i
- remove(self: dlib.sparse_ranking_pairs, x: dlib.sparse_ranking_pair) None ¶
Remove the first item from the list whose value is x. It is an error if there is no such item.
- resize(self: dlib.sparse_ranking_pairs, arg0: int) None ¶
- class dlib.sparse_vector¶
This object represents the mathematical idea of a sparse column vector. It is simply an array of dlib.pair objects, each representing an index/value pair in the vector. Any elements of the vector which are missing are implicitly set to zero.
Unless otherwise noted, any routines taking a sparse_vector assume the sparse vector is sorted and has unique elements. That is, the index values of the pairs in a sparse_vector should be listed in increasing order and there should not be duplicates. However, some functions work with “unsorted” sparse vectors. These are dlib.sparse_vector objects that have either duplicate entries or non-sorted index values. Note further that you can convert an “unsorted” sparse_vector into a properly sorted sparse vector by calling dlib.make_sparse_vector() on it.
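The sketch below illustrates the sorted/unsorted distinction using plain Python lists of (index, value) tuples in place of dlib.pair objects. It does not call dlib. Summing the values of duplicate indices is an assumption about how an unsorted vector is normalized, and the helper names are made up for this example.

```python
def make_sorted_sparse_vector(pairs):
    """Return pairs ordered by index with duplicate indices merged.

    Assumption: values that share an index are added together.
    """
    totals = {}
    for index, value in pairs:
        totals[index] = totals.get(index, 0.0) + value
    return sorted(totals.items())


def to_dense(pairs, size):
    # Missing elements are implicitly zero.
    dense = [0.0] * size
    for index, value in pairs:
        dense[index] += value
    return dense


unsorted = [(3, 1.0), (0, 2.0), (3, 0.5)]  # out of order, duplicate index 3
clean = make_sorted_sparse_vector(unsorted)
print(clean)
print(to_dense(clean, 5))
```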
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.sparse_vector) -> None
__init__(self: dlib.sparse_vector, arg0: dlib.sparse_vector) -> None
Copy constructor
__init__(self: dlib.sparse_vector, arg0: iterable) -> None
- append(self: dlib.sparse_vector, x: dlib.pair) None ¶
Add an item to the end of the list
- clear(self: dlib.sparse_vector) None ¶
- count(self: dlib.sparse_vector, x: dlib.pair) int ¶
Return the number of times x appears in the list
- extend(*args, **kwargs)¶
Overloaded function.
extend(self: dlib.sparse_vector, L: dlib.sparse_vector) -> None
Extend the list by appending all the items in the given list
extend(self: dlib.sparse_vector, arg0: list) -> None
- insert(self: dlib.sparse_vector, i: int, x: dlib.pair) None ¶
Insert an item at a given position.
- pop(*args, **kwargs)¶
Overloaded function.
pop(self: dlib.sparse_vector) -> dlib.pair
Remove and return the last item
pop(self: dlib.sparse_vector, i: int) -> dlib.pair
Remove and return the item at index i
- remove(self: dlib.sparse_vector, x: dlib.pair) None ¶
Remove the first item from the list whose value is x. It is an error if there is no such item.
- resize(self: dlib.sparse_vector, arg0: int) None ¶
- class dlib.sparse_vectors¶
This object is an array of sparse_vector objects.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.sparse_vectors) -> None
__init__(self: dlib.sparse_vectors, arg0: dlib.sparse_vectors) -> None
Copy constructor
__init__(self: dlib.sparse_vectors, arg0: iterable) -> None
- append(self: dlib.sparse_vectors, x: dlib.sparse_vector) None ¶
Add an item to the end of the list
- clear(self: dlib.sparse_vectors) None ¶
- count(self: dlib.sparse_vectors, x: dlib.sparse_vector) int ¶
Return the number of times x appears in the list
- extend(*args, **kwargs)¶
Overloaded function.
extend(self: dlib.sparse_vectors, L: dlib.sparse_vectors) -> None
Extend the list by appending all the items in the given list
extend(self: dlib.sparse_vectors, arg0: list) -> None
- insert(self: dlib.sparse_vectors, i: int, x: dlib.sparse_vector) None ¶
Insert an item at a given position.
- pop(*args, **kwargs)¶
Overloaded function.
pop(self: dlib.sparse_vectors) -> dlib.sparse_vector
Remove and return the last item
pop(self: dlib.sparse_vectors, i: int) -> dlib.sparse_vector
Remove and return the item at index i
- remove(self: dlib.sparse_vectors, x: dlib.sparse_vector) None ¶
Remove the first item from the list whose value is x. It is an error if there is no such item.
- resize(self: dlib.sparse_vectors, arg0: int) None ¶
- class dlib.sparse_vectorss¶
This object is an array of arrays of sparse_vector objects.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.sparse_vectorss) -> None
__init__(self: dlib.sparse_vectorss, arg0: dlib.sparse_vectorss) -> None
Copy constructor
__init__(self: dlib.sparse_vectorss, arg0: iterable) -> None
- append(self: dlib.sparse_vectorss, x: dlib.sparse_vectors) None ¶
Add an item to the end of the list
- clear(self: dlib.sparse_vectorss) None ¶
- count(self: dlib.sparse_vectorss, x: dlib.sparse_vectors) int ¶
Return the number of times x appears in the list
- extend(*args, **kwargs)¶
Overloaded function.
extend(self: dlib.sparse_vectorss, L: dlib.sparse_vectorss) -> None
Extend the list by appending all the items in the given list
extend(self: dlib.sparse_vectorss, arg0: list) -> None
- insert(self: dlib.sparse_vectorss, i: int, x: dlib.sparse_vectors) None ¶
Insert an item at a given position.
- pop(*args, **kwargs)¶
Overloaded function.
pop(self: dlib.sparse_vectorss) -> dlib.sparse_vectors
Remove and return the last item
pop(self: dlib.sparse_vectorss, i: int) -> dlib.sparse_vectors
Remove and return the item at index i
- remove(self: dlib.sparse_vectorss, x: dlib.sparse_vectors) None ¶
Remove the first item from the list whose value is x. It is an error if there is no such item.
- resize(self: dlib.sparse_vectorss, arg0: int) None ¶
- dlib.spatially_filter_image(*args, **kwargs)¶
Overloaded function.
spatially_filter_image(img: numpy.ndarray[(rows,cols),uint8], filter: numpy.ndarray[(rows,cols),uint8]) -> tuple
spatially_filter_image(img: numpy.ndarray[(rows,cols),float32], filter: numpy.ndarray[(rows,cols),float32]) -> tuple
spatially_filter_image(img: numpy.ndarray[(rows,cols),float64], filter: numpy.ndarray[(rows,cols),float64]) -> tuple
- requires
filter.size != 0
- ensures
Applies the given spatial filter to img and returns the result (i.e. we cross-correlate img with filter). We also return a rectangle which indicates what pixels in the returned image are considered non-border pixels and therefore contain output from the filter. E.g.
filtered_img,rect = spatially_filter_image(img, filter)
would give you the filtered image and the rectangle in question. Since the returned image has the same shape as img we fill the border pixels by setting them to 0.
The filter is applied such that it’s centered over the pixel it writes its output into. For centering purposes, we consider the center element of the filter to be filter[filter.shape[0]//2, filter.shape[1]//2] (integer division). This means that the filter that writes its output to a pixel at location point(c,r) and is W by H (width by height) pixels in size operates on exactly the pixels in the rectangle centered_rect(point(c,r),W,H) within img.
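A minimal pure-Python sketch of the behaviour described above: cross-correlation (the filter is not flipped), the filter centered at its middle element, and border pixels left at 0. Images are nested lists here, and the function name is made up for this illustration. It is not dlib's implementation and does not compute the returned rectangle.

```python
def cross_correlate(img, filt):
    rows, cols = len(img), len(img[0])
    fh, fw = len(filt), len(filt[0])
    cr, cc = fh // 2, fw // 2  # center element of the filter
    out = [[0] * cols for _ in range(rows)]
    # Only visit pixels where the filter fits entirely inside the image;
    # everything else is a border pixel and stays 0.
    for r in range(cr, rows - (fh - 1 - cr)):
        for c in range(cc, cols - (fw - 1 - cc)):
            total = 0
            for i in range(fh):
                for j in range(fw):
                    total += img[r - cr + i][c - cc + j] * filt[i][j]
            out[r][c] = total
    return out


img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
# A filter whose only non-zero weight sits to the right of its center.
pick_right = [[0, 0, 0],
              [0, 0, 1],
              [0, 0, 0]]
result = cross_correlate(img, pick_right)
for row in result:
    print(row)
```

Because this is cross-correlation rather than convolution, the `pick_right` filter copies each pixel's right-hand neighbour into the output; a true convolution would copy the left-hand neighbour instead.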
- dlib.spatially_filter_image_separable(*args, **kwargs)¶
Overloaded function.
spatially_filter_image_separable(img: numpy.ndarray[(rows,cols),uint8], row_filter: numpy.ndarray[uint8], col_filter: numpy.ndarray[uint8]) -> tuple
spatially_filter_image_separable(img: numpy.ndarray[(rows,cols),float32], row_filter: numpy.ndarray[float32], col_filter: numpy.ndarray[float32]) -> tuple
spatially_filter_image_separable(img: numpy.ndarray[(rows,cols),float64], row_filter: numpy.ndarray[float64], col_filter: numpy.ndarray[float64]) -> tuple
- requires
row_filter.size != 0
col_filter.size != 0
row_filter and col_filter are both either row or column vectors.
- ensures
Applies the given separable spatial filter to img and returns the result (i.e. we cross-correlate img with the filters). In particular, calling this function has the same effect as calling the regular spatially_filter_image() routine with a filter, FILT, defined as follows:
FILT(r,c) == col_filter(r)*row_filter(c)
Therefore, the return value of this routine is the same as if it were implemented as:
return spatially_filter_image(img, FILT)
Except that this version should be faster for separable filters.
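The equivalence FILT(r,c) == col_filter(r)*row_filter(c) can be checked with a small pure-Python sketch that evaluates a single output pixel both ways. The helper names are assumptions for this example, images are nested lists, and no dlib calls are made.

```python
def outer_filter(row_filter, col_filter):
    # FILT(r, c) == col_filter(r) * row_filter(c)
    return [[cf * rf for rf in row_filter] for cf in col_filter]


def correlate_at(img, filt, r, c):
    """Cross-correlate the full 2D filter with img, centered at (r, c)."""
    fh, fw = len(filt), len(filt[0])
    return sum(img[r - fh // 2 + i][c - fw // 2 + j] * filt[i][j]
               for i in range(fh) for j in range(fw))


def correlate_separable_at(img, row_filter, col_filter, r, c):
    """Same result using two 1D passes instead of one 2D pass."""
    fh, fw = len(col_filter), len(row_filter)
    # First pass: apply row_filter along each row under the filter window.
    row_results = [
        sum(img[r - fh // 2 + i][c - fw // 2 + j] * row_filter[j]
            for j in range(fw))
        for i in range(fh)
    ]
    # Second pass: combine the per-row results with col_filter.
    return sum(row_results[i] * col_filter[i] for i in range(fh))


img = [[1, 2, 4],
       [3, 5, 9],
       [2, 8, 7]]
row_filter = [1, 0, -1]
col_filter = [1, 2, 1]

full = correlate_at(img, outer_filter(row_filter, col_filter), 1, 1)
separable = correlate_separable_at(img, row_filter, col_filter, 1, 1)
print(full, separable)
```

For a W by H filter the separable form needs on the order of W + H multiplications per pixel when the row pass is shared across pixels, rather than W * H, which is where the speed advantage comes from.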
- dlib.sub_image(*args, **kwargs)¶
Overloaded function.
sub_image(img: array, rect: dlib.rectangle) -> array
Returns a new numpy array that references the sub window in img defined by rect. If rect is larger than img then rect is cropped so that it does not go outside img. Therefore, this routine is equivalent to performing:
win = get_rect(img).intersect(rect)
subimg = img[win.top():win.bottom()-1,win.left():win.right()-1]
sub_image(image_and_rect_tuple: tuple) -> array
Performs: return sub_image(image_and_rect_tuple[0], image_and_rect_tuple[1])
- dlib.suppress_non_maximum_edges(*args, **kwargs)¶
Overloaded function.
suppress_non_maximum_edges(horz: numpy.ndarray[(rows,cols),float32], vert: numpy.ndarray[(rows,cols),float32]) -> numpy.ndarray[(rows,cols),float32]
- requires
The two input images have the same dimensions.
- ensures
Returns an image, of the same dimensions as the input. Each element in this image holds the edge strength at that location. Moreover, edge pixels that are not local maximizers have been set to 0.
let edge_strength(r,c) == sqrt(pow(horz[r][c],2) + pow(vert[r][c],2)) (i.e. The Euclidean norm of the gradient)
let OUT denote the returned image.
- for all valid r and c:
if (edge_strength(r,c) is at a maximum with respect to its 2 neighboring pixels along the line indicated by the image gradient vector (horz[r][c],vert[r][c])) then
OUT[r][c] == edge_strength(r,c)
- else
OUT[r][c] == 0
suppress_non_maximum_edges(horz_and_vert_gradients: tuple) -> numpy.ndarray[(rows,cols),float32]
Performs: return suppress_non_maximum_edges(horz_and_vert_gradients[0], horz_and_vert_gradients[1])
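The definition of the first overload can be sketched in pure Python as follows. This is an illustration only, not dlib's implementation. Quantizing the gradient direction to the nearest of the 8 neighbours, keeping ties, and treating out-of-image neighbours as having strength 0 are all assumptions of this example.

```python
import math

# (row, col) steps for gradient angles of 0, 45, 90, ... degrees.
OFFSETS = [(0, 1), (1, 1), (1, 0), (1, -1),
           (0, -1), (-1, -1), (-1, 0), (-1, 1)]


def suppress_non_maximum_edges(horz, vert):
    rows, cols = len(horz), len(horz[0])
    # edge_strength(r,c) == sqrt(horz^2 + vert^2)
    strength = [[math.hypot(horz[r][c], vert[r][c]) for c in range(cols)]
                for r in range(rows)]
    out = [[0.0] * cols for _ in range(rows)]

    def strength_at(r, c):
        # Assumption: pixels outside the image have zero edge strength.
        if 0 <= r < rows and 0 <= c < cols:
            return strength[r][c]
        return 0.0

    for r in range(rows):
        for c in range(cols):
            gx, gy = horz[r][c], vert[r][c]
            if gx == 0 and gy == 0:
                continue  # no gradient, so no edge here
            # Quantize the gradient direction to one of 8 neighbours.
            sector = round(math.atan2(gy, gx) / (math.pi / 4)) % 8
            dr, dc = OFFSETS[sector]
            here = strength[r][c]
            if (here >= strength_at(r + dr, c + dc)
                    and here >= strength_at(r - dr, c - dc)):
                out[r][c] = here
    return out


# A vertical edge: the horizontal gradient peaks in the middle column.
horz = [[0, 1, 3, 1, 0] for _ in range(3)]
vert = [[0, 0, 0, 0, 0] for _ in range(3)]
thin = suppress_non_maximum_edges(horz, vert)
for row in thin:
    print(row)
```

Only the middle column survives, since its strength is the local maximum along the horizontal gradient direction; the weaker responses on either side are set to 0.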
- class dlib.svm_c_trainer_histogram_intersection¶
- __init__(self: dlib.svm_c_trainer_histogram_intersection) None ¶
- property c_class1¶
- property c_class2¶
- property cache_size¶
- property epsilon¶
- set_c(self: dlib.svm_c_trainer_histogram_intersection, arg0: float) None ¶
- train(self: dlib.svm_c_trainer_histogram_intersection, arg0: dlib.vectors, arg1: dlib.array) dlib._decision_function_histogram_intersection ¶
- class dlib.svm_c_trainer_linear¶
- __init__(self: dlib.svm_c_trainer_linear) None ¶
- be_quiet(self: dlib.svm_c_trainer_linear) None ¶
- be_verbose(self: dlib.svm_c_trainer_linear) None ¶
- property c_class1¶
- property c_class2¶
- property epsilon¶
- property force_last_weight_to_1¶
- property has_prior¶
- property learns_nonnegative_weights¶
- property max_iterations¶
- set_c(self: dlib.svm_c_trainer_linear, arg0: float) None ¶
- set_prior(self: dlib.svm_c_trainer_linear, arg0: dlib._decision_function_linear) None ¶
- train(self: dlib.svm_c_trainer_linear, arg0: dlib.vectors, arg1: dlib.array) dlib._decision_function_linear ¶
- class dlib.svm_c_trainer_radial_basis¶
- __init__(self: dlib.svm_c_trainer_radial_basis) None ¶
- property c_class1¶
- property c_class2¶
- property cache_size¶
- property epsilon¶
- property gamma¶
- set_c(self: dlib.svm_c_trainer_radial_basis, arg0: float) None ¶
- train(self: dlib.svm_c_trainer_radial_basis, arg0: dlib.vectors, arg1: dlib.array) dlib._decision_function_radial_basis ¶
- class dlib.svm_c_trainer_sparse_histogram_intersection¶
- __init__(self: dlib.svm_c_trainer_sparse_histogram_intersection) None ¶
- property c_class1¶
- property c_class2¶
- property cache_size¶
- property epsilon¶
- set_c(self: dlib.svm_c_trainer_sparse_histogram_intersection, arg0: float) None ¶
- train(self: dlib.svm_c_trainer_sparse_histogram_intersection, arg0: dlib.sparse_vectors, arg1: dlib.array) dlib._decision_function_sparse_histogram_intersection ¶
- class dlib.svm_c_trainer_sparse_linear¶
- __init__(self: dlib.svm_c_trainer_sparse_linear) None ¶
- be_quiet(self: dlib.svm_c_trainer_sparse_linear) None ¶
- be_verbose(self: dlib.svm_c_trainer_sparse_linear) None ¶
- property c_class1¶
- property c_class2¶
- property epsilon¶
- property force_last_weight_to_1¶
- property has_prior¶
- property learns_nonnegative_weights¶
- property max_iterations¶
- set_c(self: dlib.svm_c_trainer_sparse_linear, arg0: float) None ¶
- set_prior(self: dlib.svm_c_trainer_sparse_linear, arg0: dlib._decision_function_sparse_linear) None ¶
- train(self: dlib.svm_c_trainer_sparse_linear, arg0: dlib.sparse_vectors, arg1: dlib.array) dlib._decision_function_sparse_linear ¶
- class dlib.svm_c_trainer_sparse_radial_basis¶
- __init__(self: dlib.svm_c_trainer_sparse_radial_basis) None ¶
- property c_class1¶
- property c_class2¶
- property cache_size¶
- property epsilon¶
- property gamma¶
- set_c(self: dlib.svm_c_trainer_sparse_radial_basis, arg0: float) None ¶
- train(self: dlib.svm_c_trainer_sparse_radial_basis, arg0: dlib.sparse_vectors, arg1: dlib.array) dlib._decision_function_sparse_radial_basis ¶
- class dlib.svm_rank_trainer¶
- __init__(self: dlib.svm_rank_trainer) None ¶
- be_quiet(self: dlib.svm_rank_trainer) None ¶
- be_verbose(self: dlib.svm_rank_trainer) None ¶
- property c¶
- property epsilon¶
- property force_last_weight_to_1¶
- property has_prior¶
- property learns_nonnegative_weights¶
- property max_iterations¶
- set_prior(self: dlib.svm_rank_trainer, arg0: dlib._decision_function_linear) None ¶
- train(*args, **kwargs)¶
Overloaded function.
train(self: dlib.svm_rank_trainer, arg0: dlib.ranking_pair) -> dlib._decision_function_linear
train(self: dlib.svm_rank_trainer, arg0: dlib.ranking_pairs) -> dlib._decision_function_linear
- class dlib.svm_rank_trainer_sparse¶
- __init__(self: dlib.svm_rank_trainer_sparse) None ¶
- be_quiet(self: dlib.svm_rank_trainer_sparse) None ¶
- be_verbose(self: dlib.svm_rank_trainer_sparse) None ¶
- property c¶
- property epsilon¶
- property force_last_weight_to_1¶
- property has_prior¶
- property learns_nonnegative_weights¶
- property max_iterations¶
- set_prior(self: dlib.svm_rank_trainer_sparse, arg0: dlib._decision_function_sparse_linear) None ¶
- train(*args, **kwargs)¶
Overloaded function.
train(self: dlib.svm_rank_trainer_sparse, arg0: dlib.sparse_ranking_pair) -> dlib._decision_function_sparse_linear
train(self: dlib.svm_rank_trainer_sparse, arg0: dlib.sparse_ranking_pairs) -> dlib._decision_function_sparse_linear
- dlib.test_binary_decision_function(*args, **kwargs)¶
Overloaded function.
test_binary_decision_function(function: dlib._normalized_decision_function_radial_basis, samples: dlib.vectors, labels: dlib.array) -> binary_test
test_binary_decision_function(function: dlib._normalized_decision_function_radial_basis, samples: numpy.ndarray[(rows,cols),float64], labels: numpy.ndarray[float64]) -> binary_test
test_binary_decision_function(function: dlib._decision_function_linear, samples: dlib.vectors, labels: dlib.array) -> binary_test
test_binary_decision_function(function: dlib._decision_function_sparse_linear, samples: dlib.sparse_vectors, labels: dlib.array) -> binary_test
test_binary_decision_function(function: dlib._decision_function_radial_basis, samples: dlib.vectors, labels: dlib.array) -> binary_test
test_binary_decision_function(function: dlib._decision_function_sparse_radial_basis, samples: dlib.sparse_vectors, labels: dlib.array) -> binary_test
test_binary_decision_function(function: dlib._decision_function_polynomial, samples: dlib.vectors, labels: dlib.array) -> binary_test
test_binary_decision_function(function: dlib._decision_function_sparse_polynomial, samples: dlib.sparse_vectors, labels: dlib.array) -> binary_test
test_binary_decision_function(function: dlib._decision_function_histogram_intersection, samples: dlib.vectors, labels: dlib.array) -> binary_test
test_binary_decision_function(function: dlib._decision_function_sparse_histogram_intersection, samples: dlib.sparse_vectors, labels: dlib.array) -> binary_test
test_binary_decision_function(function: dlib._decision_function_sigmoid, samples: dlib.vectors, labels: dlib.array) -> binary_test
test_binary_decision_function(function: dlib._decision_function_sparse_sigmoid, samples: dlib.sparse_vectors, labels: dlib.array) -> binary_test
- dlib.test_ranking_function(*args, **kwargs)¶
Overloaded function.
test_ranking_function(function: dlib._decision_function_linear, samples: dlib.ranking_pairs) -> ranking_test
test_ranking_function(function: dlib._decision_function_sparse_linear, samples: dlib.sparse_ranking_pairs) -> ranking_test
test_ranking_function(function: dlib._decision_function_linear, sample: dlib.ranking_pair) -> ranking_test
test_ranking_function(function: dlib._decision_function_sparse_linear, sample: dlib.sparse_ranking_pair) -> ranking_test
- dlib.test_regression_function(*args, **kwargs)¶
Overloaded function.
test_regression_function(function: dlib._decision_function_linear, samples: dlib.vectors, targets: dlib.array) -> regression_test
test_regression_function(function: dlib._decision_function_sparse_linear, samples: dlib.sparse_vectors, targets: dlib.array) -> regression_test
test_regression_function(function: dlib._decision_function_radial_basis, samples: dlib.vectors, targets: dlib.array) -> regression_test
test_regression_function(function: dlib._decision_function_sparse_radial_basis, samples: dlib.sparse_vectors, targets: dlib.array) -> regression_test
test_regression_function(function: dlib._decision_function_histogram_intersection, samples: dlib.vectors, targets: dlib.array) -> regression_test
test_regression_function(function: dlib._decision_function_sparse_histogram_intersection, samples: dlib.sparse_vectors, targets: dlib.array) -> regression_test
test_regression_function(function: dlib._decision_function_sigmoid, samples: dlib.vectors, targets: dlib.array) -> regression_test
test_regression_function(function: dlib._decision_function_sparse_sigmoid, samples: dlib.sparse_vectors, targets: dlib.array) -> regression_test
test_regression_function(function: dlib._decision_function_polynomial, samples: dlib.vectors, targets: dlib.array) -> regression_test
test_regression_function(function: dlib._decision_function_sparse_polynomial, samples: dlib.sparse_vectors, targets: dlib.array) -> regression_test
- dlib.test_sequence_segmenter(*args, **kwargs)¶
Overloaded function.
test_sequence_segmenter(arg0: dlib.segmenter_type, arg1: dlib.vectorss, arg2: dlib.rangess) -> dlib.segmenter_test
test_sequence_segmenter(arg0: dlib.segmenter_type, arg1: dlib.sparse_vectorss, arg2: dlib.rangess) -> dlib.segmenter_test
- dlib.test_shape_predictor(*args, **kwargs)¶
Overloaded function.
test_shape_predictor(dataset_filename: str, predictor_filename: str) -> float
- ensures
Loads an image dataset from dataset_filename. We assume dataset_filename is a file using the XML format written by save_image_dataset_metadata().
Loads a shape_predictor from the file predictor_filename. This means predictor_filename should be a file produced by the train_shape_predictor() routine.
This function tests the predictor against the dataset and returns the mean average error of the predictor. In fact, the return value of this function is identical to that of dlib’s shape_predictor_trainer() routine. Therefore, see the documentation for shape_predictor_trainer() for a detailed definition of the mean average error.
test_shape_predictor(images: list, detections: list, shape_predictor: dlib.shape_predictor) -> float
- requires
len(images) == len(object_detections)
images should be a list of numpy matrices that represent images, either RGB or grayscale.
object_detections should be a list of lists of dlib.full_object_detection objects. Each dlib.full_object_detection contains the bounding box and the lists of points that make up the object parts.
- ensures
shape_predictor should be a dlib.shape_predictor object, such as one produced by the train_shape_predictor() routine.
This function tests the predictor against the dataset and returns the mean average error of the predictor. In fact, the return value of this function is identical to that of dlib’s shape_predictor_trainer() routine. Therefore, see the documentation for shape_predictor_trainer() for a detailed definition of the mean average error.
test_shape_predictor(images: list, detections: list, scales: list, shape_predictor: dlib.shape_predictor) -> float
- requires
len(images) == len(object_detections)
len(object_detections) == len(scales)
for every sublist in object_detections: len(object_detections[i]) == len(scales[i])
scales is a list of floating point scales that each predicted part location should be divided by. Useful for normalization.
images should be a list of numpy matrices that represent images, either RGB or grayscale.
object_detections should be a list of lists of dlib.full_object_detection objects. Each dlib.full_object_detection contains the bounding box and the lists of points that make up the object parts.
- ensures
shape_predictor should be a dlib.shape_predictor object, such as one produced by the train_shape_predictor() routine.
This function tests the predictor against the dataset and returns the mean average error of the predictor. In fact, the return value of this function is identical to that of dlib’s shape_predictor_trainer() routine. Therefore, see the documentation for shape_predictor_trainer() for a detailed definition of the mean average error.
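The error measure and the role of scales described above can be sketched in plain Python. This is an illustration only, not dlib's implementation: it assumes the error is the mean Euclidean distance between predicted and true part locations, each divided by the object's scale. The name mean_part_error is hypothetical, and the per-image sublists are flattened into one list of objects for brevity.

```python
import math

def mean_part_error(predicted, truth, scales=None):
    """Mean distance between predicted and true part locations.

    predicted, truth: list of objects, each a list of (x, y) part locations.
    scales: optional list with one divisor per object (assumed 1.0 if omitted).
    """
    if scales is None:
        scales = [1.0] * len(predicted)
    errors = []
    for pred_parts, true_parts, scale in zip(predicted, truth, scales):
        for (px, py), (tx, ty) in zip(pred_parts, true_parts):
            # Distance of one predicted part from its true location, normalized.
            errors.append(math.hypot(px - tx, py - ty) / scale)
    return sum(errors) / len(errors)

predicted = [[(0.0, 0.0), (3.0, 4.0)]]
truth = [[(0.0, 0.0), (0.0, 0.0)]]
unscaled = mean_part_error(predicted, truth)         # part distances are 0 and 5
scaled = mean_part_error(predicted, truth, [10.0])   # same distances divided by 10
```

Dividing by a per-object scale (for example, a face's inter-ocular distance) makes errors comparable across objects of different sizes.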
- dlib.test_simple_object_detector(*args, **kwargs)¶
Overloaded function.
test_simple_object_detector(dataset_filename: str, detector_filename: str, upsampling_amount: int=-1) -> dlib.simple_test_results
- ensures
Loads an image dataset from dataset_filename. We assume dataset_filename is a file using the XML format written by save_image_dataset_metadata().
Loads a simple_object_detector from the file detector_filename. This means detector_filename should be a file produced by the train_simple_object_detector() routine.
This function tests the detector against the dataset and returns the precision, recall, and average precision of the detector. In fact, the return value of this function is identical to that of dlib’s test_object_detection_function() routine. Therefore, see the documentation for test_object_detection_function() for a detailed definition of these metrics.
If upsampling_amount>=0 then we upsample the data by upsampling_amount rather than use any upsampling amount that happens to be encoded in the given detector. If upsampling_amount<0 then we use the upsampling amount the detector wants to use.
test_simple_object_detector(dataset_filename: str, detector: dlib::simple_object_detector_py, upsampling_amount: int=-1) -> dlib.simple_test_results
- ensures
Loads an image dataset from dataset_filename. We assume dataset_filename is a file using the XML format written by save_image_dataset_metadata().
Uses the detector object passed in as detector; no detector file is loaded.
This function tests the detector against the dataset and returns the precision, recall, and average precision of the detector. In fact, the return value of this function is identical to that of dlib’s test_object_detection_function() routine. Therefore, see the documentation for test_object_detection_function() for a detailed definition of these metrics.
If upsampling_amount>=0 then we upsample the data by upsampling_amount rather than use any upsampling amount that happens to be encoded in the given detector. If upsampling_amount<0 then we use the upsampling amount the detector wants to use.
test_simple_object_detector(images: list, boxes: list, detector: dlib::object_detector<dlib::scan_fhog_pyramid<dlib::pyramid_down<6u>, dlib::default_fhog_feature_extractor> >, upsampling_amount: int=0) -> dlib.simple_test_results
- requires
len(images) == len(boxes)
images should be a list of numpy matrices that represent images, either RGB or grayscale.
boxes should be a list of lists of dlib.rectangle objects.
Optionally, take the number of times to upsample the testing images (upsampling_amount >= 0).
- ensures
Uses the detector object passed in as detector; no detector file is loaded.
This function tests the detector against the dataset and returns the precision, recall, and average precision of the detector. In fact, the return value of this function is identical to that of dlib’s test_object_detection_function() routine. Therefore, see the documentation for test_object_detection_function() for a detailed definition of these metrics.
test_simple_object_detector(images: list, boxes: list, detector: dlib::simple_object_detector_py, upsampling_amount: int=-1) -> dlib.simple_test_results
- requires
len(images) == len(boxes)
images should be a list of numpy matrices that represent images, either RGB or grayscale.
boxes should be a list of lists of dlib.rectangle objects.
- ensures
Uses the detector object passed in as detector; no detector file is loaded.
This function tests the detector against the dataset and returns the precision, recall, and average precision of the detector. In fact, the return value of this function is identical to that of dlib’s test_object_detection_function() routine. Therefore, see the documentation for test_object_detection_function() for a detailed definition of these metrics.
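The upsampling_amount rule stated for the overloads that default to -1 can be written as a small helper. This is a sketch of the documented rule only; effective_upsampling and detector_default are hypothetical names, not part of dlib.

```python
def effective_upsampling(requested, detector_default):
    """Resolve which upsampling amount a test run would use.

    requested: the upsampling_amount argument (negative means "not specified").
    detector_default: the amount encoded in the detector (hypothetical value here).
    """
    if requested >= 0:
        # An explicit non-negative request overrides the detector's own setting.
        return requested
    # A negative request defers to whatever the detector wants to use.
    return detector_default

resolved = [effective_upsampling(r, detector_default=2) for r in (-1, 0, 3)]
```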
- dlib.threshold_filter_singular_values(detector: dlib.simple_object_detector, thresh: float) dlib.simple_object_detector ¶
- requires
thresh >= 0
- ensures
Removes all components of the filters in the given detector that have singular values that are smaller than the given threshold. Therefore, this function allows you to control how many separable filters are in a detector. In particular, as thresh gets larger the quantity num_separable_filters(threshold_filter_singular_values(detector,thresh)) will generally get smaller and therefore give a faster running detector. However, note that at some point a large enough thresh will drop too much information from the filters and their accuracy will suffer.
returns the updated detector
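The effect of thresh on the number of retained filter components can be illustrated without dlib. The sketch below assumes the singular values of one filter are already known (the numbers are made up) and keeps those that are not smaller than thresh. keep_components is a hypothetical helper, not a dlib function.

```python
def keep_components(singular_values, thresh):
    """Return the singular values that survive thresholding."""
    if thresh < 0:
        raise ValueError("thresh must be >= 0")
    # Components whose singular value is smaller than thresh are dropped.
    return [s for s in singular_values if s >= thresh]

values = [9.0, 4.0, 0.5, 0.01]  # illustrative singular values only
counts = [len(keep_components(values, t)) for t in (0.0, 0.1, 1.0, 5.0)]
```

As thresh grows, fewer components remain, which mirrors the speed versus accuracy trade-off described above.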
- dlib.threshold_image(*args, **kwargs)¶
Overloaded function.
threshold_image(img: numpy.ndarray[(rows,cols),uint8]) -> numpy.ndarray[(rows,cols),uint8]
threshold_image(img: numpy.ndarray[(rows,cols),uint16]) -> numpy.ndarray[(rows,cols),uint8]
threshold_image(img: numpy.ndarray[(rows,cols),uint32]) -> numpy.ndarray[(rows,cols),uint8]
threshold_image(img: numpy.ndarray[(rows,cols),float32]) -> numpy.ndarray[(rows,cols),uint8]
threshold_image(img: numpy.ndarray[(rows,cols),float64]) -> numpy.ndarray[(rows,cols),uint8]
threshold_image(img: numpy.ndarray[(rows,cols,3),uint8]) -> numpy.ndarray[(rows,cols),uint8]
Thresholds img and returns the result. Pixels in img with grayscale values >= partition_pixels(img) have an output value of 255 and all others have a value of 0.
threshold_image(img: numpy.ndarray[(rows,cols),uint8], thresh: int) -> numpy.ndarray[(rows,cols),uint8]
threshold_image(img: numpy.ndarray[(rows,cols),uint16], thresh: int) -> numpy.ndarray[(rows,cols),uint8]
threshold_image(img: numpy.ndarray[(rows,cols),uint32], thresh: int) -> numpy.ndarray[(rows,cols),uint8]
threshold_image(img: numpy.ndarray[(rows,cols),float32], thresh: float) -> numpy.ndarray[(rows,cols),uint8]
threshold_image(img: numpy.ndarray[(rows,cols),float64], thresh: float) -> numpy.ndarray[(rows,cols),uint8]
threshold_image(img: numpy.ndarray[(rows,cols,3),uint8], thresh: int) -> numpy.ndarray[(rows,cols),uint8]
Thresholds img and returns the result. Pixels in img with grayscale values >= thresh have an output value of 255 and all others have a value of 0.
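The thresholding rule can be sketched in plain Python, using nested lists in place of numpy arrays so that the example runs without dlib installed. threshold_rows is a hypothetical name and covers only the explicit-thresh case for grayscale input.

```python
def threshold_rows(img, thresh):
    """Map every pixel to 255 if it is >= thresh, otherwise to 0."""
    return [[255 if px >= thresh else 0 for px in row] for row in img]

img = [[10, 128, 200],
       [127, 129, 0]]
binary = threshold_rows(img, 128)
```

Note that the comparison is inclusive: a pixel exactly equal to thresh becomes 255.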
- dlib.tile_images(images: list) array ¶
- requires
images is a list of numpy arrays that can be interpreted as images. They must all be the same type of image as well.
- ensures
This function takes the given images and tiles them into a single large square image and returns this new big tiled image. Therefore, it is a useful method to visualize many small images at once.
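The tiling idea can be sketched as follows. The grid rule used here (ceil(sqrt(n)) columns, then as many rows as needed) is an assumption chosen for illustration, and the sketch assumes all images share one size and pads unused cells with 0; dlib's actual layout may differ. grid_shape and tile are hypothetical names.

```python
import math

def grid_shape(n):
    """Pick a near-square grid with room for n tiles (assumed layout rule)."""
    if n == 0:
        return 0, 0
    cols = math.isqrt(n - 1) + 1   # ceil(sqrt(n)) for n >= 1
    rows = -(-n // cols)           # ceil(n / cols)
    return rows, cols

def tile(images):
    """Tile equally sized 2D images (lists of rows) into one large image."""
    rows, cols = grid_shape(len(images))
    h, w = len(images[0]), len(images[0][0])
    out = [[0] * (cols * w) for _ in range(rows * h)]
    for idx, im in enumerate(images):
        # Top-left corner of this tile inside the big image.
        r0, c0 = (idx // cols) * h, (idx % cols) * w
        for r in range(h):
            out[r0 + r][c0:c0 + w] = im[r]
    return out

tiled = tile([[[1]], [[2]], [[3]]])
```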
- dlib.train_sequence_segmenter(*args, **kwargs)¶
Overloaded function.
train_sequence_segmenter(samples: dlib.vectorss, segments: dlib.rangess, params: dlib.segmenter_params=<BIO,highFeats,signed,win=5,threads=4,eps=0.1,cache=40,non-verbose,C=100>) -> dlib.segmenter_type
train_sequence_segmenter(samples: dlib.sparse_vectorss, segments: dlib.rangess, params: dlib.segmenter_params=<BIO,highFeats,signed,win=5,threads=4,eps=0.1,cache=40,non-verbose,C=100>) -> dlib.segmenter_type
- dlib.train_shape_predictor(*args, **kwargs)¶
Overloaded function.
train_shape_predictor(images: list, object_detections: list, options: dlib.shape_predictor_training_options) -> dlib.shape_predictor
- requires
options.lambda_param > 0
0 < options.nu <= 1
options.feature_pool_region_padding >= 0
len(images) == len(object_detections)
images should be a list of numpy matrices that represent images, either RGB or grayscale.
object_detections should be a list of lists of dlib.full_object_detection objects. Each dlib.full_object_detection contains the bounding box and the lists of points that make up the object parts.
- ensures
Uses dlib’s shape_predictor_trainer object to train a shape_predictor based on the provided labeled images, full_object_detections, and options.
The trained shape_predictor is returned.
train_shape_predictor(dataset_filename: str, predictor_output_filename: str, options: dlib.shape_predictor_training_options) -> None
- requires
options.lambda_param > 0
0 < options.nu <= 1
options.feature_pool_region_padding >= 0
- ensures
Uses dlib’s shape_predictor_trainer to train a shape_predictor based on the labeled images in the XML file dataset_filename and the provided options. This function assumes the file dataset_filename is in the XML format produced by dlib’s save_image_dataset_metadata() routine.
The trained shape predictor is serialized to the file predictor_output_filename.
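The preconditions listed for both overloads can be checked up front before starting a long training run. The sketch below uses a hypothetical TrainingOptions dataclass as a stand-in for dlib.shape_predictor_training_options; its default values are arbitrary illustration values, not dlib's defaults.

```python
from dataclasses import dataclass

@dataclass
class TrainingOptions:
    """Hypothetical stand-in holding only the fields named in the requirements."""
    lambda_param: float = 0.1
    nu: float = 0.1
    feature_pool_region_padding: float = 0.0

def check_preconditions(options, images=None, object_detections=None):
    """Return a list of violated requirements (empty if all hold)."""
    problems = []
    if not options.lambda_param > 0:
        problems.append("options.lambda_param > 0")
    if not 0 < options.nu <= 1:
        problems.append("0 < options.nu <= 1")
    if not options.feature_pool_region_padding >= 0:
        problems.append("options.feature_pool_region_padding >= 0")
    # The length check applies only to the in-memory overload.
    if images is not None and len(images) != len(object_detections):
        problems.append("len(images) == len(object_detections)")
    return problems

ok = check_preconditions(TrainingOptions())
bad = check_preconditions(TrainingOptions(nu=1.5), images=[1, 2], object_detections=[[]])
```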
- dlib.train_simple_object_detector(*args, **kwargs)¶
Overloaded function.
train_simple_object_detector(dataset_filename: str, detector_output_filename: str, options: dlib.simple_object_detector_training_options) -> None
- requires
options.C > 0
- ensures
Uses the structural_object_detection_trainer to train a simple_object_detector based on the labeled images in the XML file dataset_filename. This function assumes the file dataset_filename is in the XML format produced by dlib’s save_image_dataset_metadata() routine.
This function will apply a reasonable set of default parameters and preprocessing techniques to the training procedure for simple_object_detector objects. So the point of this function is to provide you with a very easy way to train a basic object detector.
The trained object detector is serialized to the file detector_output_filename.
train_simple_object_detector(images: list, boxes: list, options: dlib.simple_object_detector_training_options) -> dlib::simple_object_detector_py
- requires
options.C > 0
len(images) == len(boxes)
images should be a list of numpy matrices that represent images, either RGB or grayscale.
boxes should be a list of lists of dlib.rectangle objects.
- ensures
Uses the structural_object_detection_trainer to train a simple_object_detector based on the labeled images and bounding boxes.
This function will apply a reasonable set of default parameters and preprocessing techniques to the training procedure for simple_object_detector objects. So the point of this function is to provide you with a very easy way to train a basic object detector.
The trained object detector is returned.
- dlib.transform_image(*args, **kwargs)¶
Overloaded function.
transform_image(img: numpy.ndarray[(rows,cols),uint8], map_point: dlib.point_transform_projective, rows: int, columns: int) -> numpy.ndarray[(rows,cols),uint8]
transform_image(img: numpy.ndarray[(rows,cols),uint16], map_point: dlib.point_transform_projective, rows: int, columns: int) -> numpy.ndarray[(rows,cols),uint16]
transform_image(img: numpy.ndarray[(rows,cols),uint32], map_point: dlib.point_transform_projective, rows: int, columns: int) -> numpy.ndarray[(rows,cols),uint32]
transform_image(img: numpy.ndarray[(rows,cols),uint64], map_point: dlib.point_transform_projective, rows: int, columns: int) -> numpy.ndarray[(rows,cols),uint64]
transform_image(img: numpy.ndarray[(rows,cols),int8], map_point: dlib.point_transform_projective, rows: int, columns: int) -> numpy.ndarray[(rows,cols),int8]
transform_image(img: numpy.ndarray[(rows,cols),int16], map_point: dlib.point_transform_projective, rows: int, columns: int) -> numpy.ndarray[(rows,cols),int16]
transform_image(img: numpy.ndarray[(rows,cols),int32], map_point: dlib.point_transform_projective, rows: int, columns: int) -> numpy.ndarray[(rows,cols),int32]
transform_image(img: numpy.ndarray[(rows,cols),int64], map_point: dlib.point_transform_projective, rows: int, columns: int) -> numpy.ndarray[(rows,cols),int64]
transform_image(img: numpy.ndarray[(rows,cols),float32], map_point: dlib.point_transform_projective, rows: int, columns: int) -> numpy.ndarray[(rows,cols),float32]
transform_image(img: numpy.ndarray[(rows,cols),float64], map_point: dlib.point_transform_projective, rows: int, columns: int) -> numpy.ndarray[(rows,cols),float64]
transform_image(img: numpy.ndarray[(rows,cols,3),uint8], map_point: dlib.point_transform_projective, rows: int, columns: int) -> numpy.ndarray[(rows,cols,3),uint8]
- requires
rows > 0
columns > 0
- ensures
Returns an image that is the given rows by columns in size and contains a transformed part of img. To do this, we interpret map_point as a mapping from pixels in the returned image to pixels in the input img. transform_image() uses this mapping and bilinear interpolation to fill the output image with an interpolated copy of img.
Any locations in the output image that map to pixels outside img are set to 0.
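The output-to-input mapping with bilinear interpolation can be sketched in plain Python for a grayscale image. This is an illustration under stated assumptions, not dlib's code: map_point is modeled as a plain callable taking (x, y) in the output image and returning (x, y) in the input image, and the exact treatment of locations on the image edge is an assumption.

```python
import math

def transform(img, map_point, rows, columns):
    """Fill a rows x columns image by sampling img at map_point(x, y)."""
    if rows <= 0 or columns <= 0:
        raise ValueError("rows and columns must be > 0")
    in_rows, in_cols = len(img), len(img[0])
    out = [[0.0] * columns for _ in range(rows)]
    for r in range(rows):
        for c in range(columns):
            x, y = map_point(c, r)  # location in the input image
            if not (0 <= x <= in_cols - 1 and 0 <= y <= in_rows - 1):
                continue  # maps outside img, so the output pixel stays 0
            x0, y0 = int(math.floor(x)), int(math.floor(y))
            x1, y1 = min(x0 + 1, in_cols - 1), min(y0 + 1, in_rows - 1)
            fx, fy = x - x0, y - y0
            # Interpolate along x on the two neighboring rows, then along y.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            out[r][c] = top * (1 - fy) + bottom * fy
    return out

img = [[0.0, 10.0],
       [20.0, 30.0]]
# Each output pixel samples the input half a pixel to its right.
shifted = transform(img, lambda x, y: (x + 0.5, y), rows=2, columns=2)
```

Because the mapping runs from output to input, every output pixel receives exactly one value, and no holes appear in the result.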
- dlib.translate_rect(*args, **kwargs)¶
Overloaded function.
translate_rect(rect: dlib.rectangle, p: dlib.point) -> dlib.rectangle
- returns rectangle(rect.left()+p.x, rect.top()+p.y, rect.right()+p.x, rect.bottom()+p.y)
(i.e. moves the location of the rectangle but doesn’t change its shape)
translate_rect(rect: dlib.drectangle, p: dlib.point) -> dlib.drectangle
- returns rectangle(rect.left()+p.x, rect.top()+p.y, rect.right()+p.x, rect.bottom()+p.y)
(i.e. moves the location of the rectangle but doesn’t change its shape)
translate_rect(rect: dlib.rectangle, p: dlib.dpoint) -> dlib.rectangle
- returns rectangle(rect.left()+p.x, rect.top()+p.y, rect.right()+p.x, rect.bottom()+p.y)
(i.e. moves the location of the rectangle but doesn’t change its shape)
translate_rect(rect: dlib.drectangle, p: dlib.dpoint) -> dlib.drectangle
- returns rectangle(rect.left()+p.x, rect.top()+p.y, rect.right()+p.x, rect.bottom()+p.y)
(i.e. moves the location of the rectangle but doesn’t change its shape)
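The arithmetic behind translate_rect can be sketched with plain tuples. Rect and Point are hypothetical stand-ins for dlib.rectangle and dlib.point so that the example runs without dlib installed.

```python
from collections import namedtuple

Rect = namedtuple("Rect", "left top right bottom")
Point = namedtuple("Point", "x y")

def translate(rect, p):
    """Shift every edge of rect by the offset p."""
    return Rect(rect.left + p.x, rect.top + p.y,
                rect.right + p.x, rect.bottom + p.y)

original = Rect(10, 20, 30, 40)
moved = translate(original, Point(5, -3))
```

Since the same offset is added to opposite edges, the width and height of the rectangle are unchanged.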
- class dlib.vector¶
This object represents the mathematical idea of a column vector.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.vector) -> None
__init__(self: dlib.vector, arg0: object) -> None
- resize(self: dlib.vector, arg0: int) None ¶
- set_size(self: dlib.vector, arg0: int) None ¶
- property shape¶
- class dlib.vectors¶
This object is an array of vector objects.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.vectors) -> None
__init__(self: dlib.vectors, arg0: dlib.vectors) -> None
Copy constructor
__init__(self: dlib.vectors, arg0: iterable) -> None
- append(self: dlib.vectors, x: dlib.vector) None ¶
Add an item to the end of the list
- clear(self: dlib.vectors) None ¶
- count(self: dlib.vectors, x: dlib.vector) int ¶
Return the number of times x appears in the list
- extend(*args, **kwargs)¶
Overloaded function.
extend(self: dlib.vectors, L: dlib.vectors) -> None
Extend the list by appending all the items in the given list
extend(self: dlib.vectors, arg0: list) -> None
- insert(self: dlib.vectors, i: int, x: dlib.vector) None ¶
Insert an item at a given position.
- pop(*args, **kwargs)¶
Overloaded function.
pop(self: dlib.vectors) -> dlib.vector
Remove and return the last item
pop(self: dlib.vectors, i: int) -> dlib.vector
Remove and return the item at index i
- remove(self: dlib.vectors, x: dlib.vector) None ¶
Remove the first item from the list whose value is x. It is an error if there is no such item.
- resize(self: dlib.vectors, arg0: int) None ¶
- class dlib.vectorss¶
This object is an array of arrays of vector objects.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.vectorss) -> None
__init__(self: dlib.vectorss, arg0: dlib.vectorss) -> None
Copy constructor
__init__(self: dlib.vectorss, arg0: iterable) -> None
- append(self: dlib.vectorss, x: dlib.vectors) None ¶
Add an item to the end of the list
- clear(self: dlib.vectorss) None ¶
- count(self: dlib.vectorss, x: dlib.vectors) int ¶
Return the number of times x appears in the list
- extend(*args, **kwargs)¶
Overloaded function.
extend(self: dlib.vectorss, L: dlib.vectorss) -> None
Extend the list by appending all the items in the given list
extend(self: dlib.vectorss, arg0: list) -> None
- insert(self: dlib.vectorss, i: int, x: dlib.vectors) None ¶
Insert an item at a given position.
- pop(*args, **kwargs)¶
Overloaded function.
pop(self: dlib.vectorss) -> dlib.vectors
Remove and return the last item
pop(self: dlib.vectorss, i: int) -> dlib.vectors
Remove and return the item at index i
- remove(self: dlib.vectorss, x: dlib.vectors) None ¶
Remove the first item from the list whose value is x. It is an error if there is no such item.
- resize(self: dlib.vectorss, arg0: int) None ¶
- dlib.zero_border_pixels(*args, **kwargs)¶
Overloaded function.
zero_border_pixels(img: numpy.ndarray[(rows,cols),uint8], x_border_size: int, y_border_size: int) -> None
zero_border_pixels(img: numpy.ndarray[(rows,cols),uint16], x_border_size: int, y_border_size: int) -> None
zero_border_pixels(img: numpy.ndarray[(rows,cols),uint32], x_border_size: int, y_border_size: int) -> None
zero_border_pixels(img: numpy.ndarray[(rows,cols),uint64], x_border_size: int, y_border_size: int) -> None
zero_border_pixels(img: numpy.ndarray[(rows,cols),int8], x_border_size: int, y_border_size: int) -> None
zero_border_pixels(img: numpy.ndarray[(rows,cols),int16], x_border_size: int, y_border_size: int) -> None
zero_border_pixels(img: numpy.ndarray[(rows,cols),int32], x_border_size: int, y_border_size: int) -> None
zero_border_pixels(img: numpy.ndarray[(rows,cols),int64], x_border_size: int, y_border_size: int) -> None
zero_border_pixels(img: numpy.ndarray[(rows,cols),float32], x_border_size: int, y_border_size: int) -> None
zero_border_pixels(img: numpy.ndarray[(rows,cols),float64], x_border_size: int, y_border_size: int) -> None
zero_border_pixels(img: numpy.ndarray[(rows,cols,3),uint8], x_border_size: int, y_border_size: int) -> None
- requires
x_border_size >= 0
y_border_size >= 0
- ensures
The size and shape of img are not changed by this function.
Assigns 0 to the pixel img[r][c] whenever r is a valid row such that r+y_border_size or r-y_border_size gives an invalid row, or c is a valid column such that c+x_border_size or c-x_border_size gives an invalid column (i.e. assigns 0 to every pixel in the border of img).
zero_border_pixels(img: numpy.ndarray[(rows,cols),uint8], inside: dlib.rectangle) -> None
zero_border_pixels(img: numpy.ndarray[(rows,cols),uint16], inside: dlib.rectangle) -> None
zero_border_pixels(img: numpy.ndarray[(rows,cols),uint32], inside: dlib.rectangle) -> None
zero_border_pixels(img: numpy.ndarray[(rows,cols),uint64], inside: dlib.rectangle) -> None
zero_border_pixels(img: numpy.ndarray[(rows,cols),int8], inside: dlib.rectangle) -> None
zero_border_pixels(img: numpy.ndarray[(rows,cols),int16], inside: dlib.rectangle) -> None
zero_border_pixels(img: numpy.ndarray[(rows,cols),int32], inside: dlib.rectangle) -> None
zero_border_pixels(img: numpy.ndarray[(rows,cols),int64], inside: dlib.rectangle) -> None
zero_border_pixels(img: numpy.ndarray[(rows,cols),float32], inside: dlib.rectangle) -> None
zero_border_pixels(img: numpy.ndarray[(rows,cols),float64], inside: dlib.rectangle) -> None
zero_border_pixels(img: numpy.ndarray[(rows,cols,3),uint8], inside: dlib.rectangle) -> None
- ensures
The size and shape of img are not changed by this function.
All the pixels in img that are not contained inside the inside rectangle given to this function are set to 0. That is, anything not “inside” is on the border and set to 0.
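Both forms of border zeroing can be sketched in plain Python on nested lists. zero_border and zero_outside are hypothetical names. The rectangle is passed as four integers, and treating its edges as inclusive is an assumption of this sketch.

```python
def zero_border(img, x_border_size, y_border_size):
    """Zero a frame x_border_size wide and y_border_size tall, in place."""
    nr, nc = len(img), len(img[0])
    for r in range(nr):
        row_in_border = r < y_border_size or r >= nr - y_border_size
        for c in range(nc):
            col_in_border = c < x_border_size or c >= nc - x_border_size
            if row_in_border or col_in_border:
                img[r][c] = 0

def zero_outside(img, left, top, right, bottom):
    """Zero every pixel that is not inside the given rectangle, in place."""
    for r, row in enumerate(img):
        for c in range(len(row)):
            if not (left <= c <= right and top <= r <= bottom):
                row[c] = 0

a = [[1] * 4 for _ in range(4)]
zero_border(a, 1, 1)

b = [[1] * 4 for _ in range(3)]
zero_outside(b, left=1, top=0, right=2, bottom=1)
```

Both helpers modify the image in place and return nothing, matching the None return type of the documented overloads.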
Routines for setting CUDA-specific properties.
- dlib.cuda.get_device() int ¶
Get the active CUDA device.
- dlib.cuda.get_num_devices() int ¶
Find out how many CUDA devices are available.
- dlib.cuda.set_device(device_id: int) None ¶
Set the active CUDA device. It is required that 0 <= device_id < get_num_devices().
Routines and objects for working with dlib’s image dataset metadata XML files.
- class dlib.image_dataset_metadata.box¶
This object represents an annotated rectangular area of an image. It is typically used to mark the location of an object such as a person, car, etc.
The main variable of interest is rect. It gives the location of the box. All the other variables are optional.
- FEMALE = gender_type.FEMALE¶
- MALE = gender_type.MALE¶
- UNKNOWN = gender_type.UNKNOWN¶
- __init__(self: dlib.image_dataset_metadata.box) None ¶
- property age¶
- property angle¶
- property detection_score¶
- property difficult¶
- property gender¶
- class gender_type¶
- FEMALE = gender_type.FEMALE¶
- MALE = gender_type.MALE¶
- UNKNOWN = gender_type.UNKNOWN¶
- __init__(self: dlib.image_dataset_metadata.box.gender_type, arg0: int) None ¶
- property ignore¶
- property label¶
- property occluded¶
- property parts¶
- property pose¶
- property rect¶
- property truncated¶
- class dlib.image_dataset_metadata.boxes¶
An array of dlib::image_dataset_metadata::box objects.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.image_dataset_metadata.boxes) -> None
__init__(self: dlib.image_dataset_metadata.boxes, arg0: dlib.image_dataset_metadata.boxes) -> None
Copy constructor
__init__(self: dlib.image_dataset_metadata.boxes, arg0: iterable) -> None
- append(self: dlib.image_dataset_metadata.boxes, x: dlib.image_dataset_metadata.box) None ¶
Add an item to the end of the list
- extend(self: dlib.image_dataset_metadata.boxes, L: dlib.image_dataset_metadata.boxes) None ¶
Extend the list by appending all the items in the given list
- insert(self: dlib.image_dataset_metadata.boxes, i: int, x: dlib.image_dataset_metadata.box) None ¶
Insert an item at a given position.
- pop(*args, **kwargs)¶
Overloaded function.
pop(self: dlib.image_dataset_metadata.boxes) -> dlib.image_dataset_metadata.box
Remove and return the last item
pop(self: dlib.image_dataset_metadata.boxes, i: int) -> dlib.image_dataset_metadata.box
Remove and return the item at index i
- class dlib.image_dataset_metadata.dataset¶
This object represents a labeled set of images. In particular, it contains the filename for each image as well as annotated boxes.
- __init__(self: dlib.image_dataset_metadata.dataset) None ¶
- property comment¶
- property images¶
- property name¶
- class dlib.image_dataset_metadata.image¶
This object represents an annotated image.
- __init__(self: dlib.image_dataset_metadata.image) None ¶
- property boxes¶
- property filename¶
- class dlib.image_dataset_metadata.images¶
An array of dlib::image_dataset_metadata::image objects.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.image_dataset_metadata.images) -> None
__init__(self: dlib.image_dataset_metadata.images, arg0: dlib.image_dataset_metadata.images) -> None
Copy constructor
__init__(self: dlib.image_dataset_metadata.images, arg0: iterable) -> None
- append(self: dlib.image_dataset_metadata.images, x: dlib.image_dataset_metadata.image) None ¶
Add an item to the end of the list
- extend(self: dlib.image_dataset_metadata.images, L: dlib.image_dataset_metadata.images) None ¶
Extend the list by appending all the items in the given list
- insert(self: dlib.image_dataset_metadata.images, i: int, x: dlib.image_dataset_metadata.image) None ¶
Insert an item at a given position.
- pop(*args, **kwargs)¶
Overloaded function.
pop(self: dlib.image_dataset_metadata.images) -> dlib.image_dataset_metadata.image
Remove and return the last item
pop(self: dlib.image_dataset_metadata.images, i: int) -> dlib.image_dataset_metadata.image
Remove and return the item at index i
- dlib.image_dataset_metadata.load_image_dataset_metadata(filename: str) dlib.image_dataset_metadata.dataset ¶
Attempts to interpret filename as a file containing XML formatted data as produced by the save_image_dataset_metadata() function. The data is loaded and returned as a dlib.image_dataset_metadata.dataset object.
- class dlib.image_dataset_metadata.parts¶
This object is a dictionary mapping string names to object part locations.
- __init__(*args, **kwargs)¶
Overloaded function.
__init__(self: dlib.image_dataset_metadata.parts) -> None
__init__(self: dlib.image_dataset_metadata.parts, arg0: dict) -> None
- items(self: dlib.image_dataset_metadata.parts) iterator ¶
- dlib.image_dataset_metadata.save_image_dataset_metadata(data: dlib.image_dataset_metadata.dataset, filename: str) None ¶
Writes the contents of the data object to a file with the given filename. The file will be in an XML format.
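For readers who want to inspect such a file without dlib, the sketch below parses a small sample with Python's standard library. The element and attribute names (dataset, images, image with file, box with top/left/width/height, label, part with name/x/y) are assumed from the layout seen in dlib's example dataset files, and the sample content itself is made up. read_boxes is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<dataset>
  <name>demo</name>
  <comment>illustrative sample only</comment>
  <images>
    <image file='a.jpg'>
      <box top='90' left='194' width='37' height='37'>
        <label>face</label>
        <part name='nose' x='210' y='110'/>
      </box>
    </image>
  </images>
</dataset>
"""

def read_boxes(xml_text):
    """Map each image filename to a list of its annotated boxes."""
    root = ET.fromstring(xml_text)
    result = {}
    for image in root.iter("image"):
        boxes = []
        for box in image.findall("box"):
            left, top = int(box.get("left")), int(box.get("top"))
            width, height = int(box.get("width")), int(box.get("height"))
            parts = {p.get("name"): (int(p.get("x")), int(p.get("y")))
                     for p in box.findall("part")}
            boxes.append({
                # Convert to (left, top, right, bottom) with inclusive edges.
                "rect": (left, top, left + width - 1, top + height - 1),
                "label": box.findtext("label", default=""),
                "parts": parts,
            })
        result[image.get("file")] = boxes
    return result

dataset = read_boxes(SAMPLE)
```

When dlib is available, load_image_dataset_metadata() is the supported way to read these files; the parser above is only meant to make the structure visible.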