99 machine learning methods and techniques
Prioritized Sweeping is a reinforcement learning technique for model-based algorithms that prioritizes updates according to a measure of urgency, and performs these updates first. A queue is maintained of every state-action pair whose estimated value would change nontrivially if updated, prioritized by the size of the change. When the top pair in the queue is updated, the effect on each of its predecessor pairs is computed. If the effect is greater than some threshold, then the pair is inserted in the queue with the new priority. Source: Sutton and Barto, Reinforcement Learning, 2nd Edition
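The queue-driven update loop described above can be sketched in a few lines. This is a minimal tabular illustration under assumed simplifications: a deterministic 1-D chain environment, a single action, and full one-step backups; the environment, threshold, and predecessor bookkeeping are all assumptions made for the example.

```python
import heapq

GAMMA, ALPHA, THETA = 0.9, 1.0, 1e-4   # ALPHA = 1: full backup, deterministic model

# Deterministic chain: state i, action "right" -> state i+1; reward 1 on reaching N.
N = 5
def model(s, a):                        # returns (reward, next_state)
    s2 = min(s + 1, N)
    return (1.0 if (s < N and s2 == N) else 0.0), s2

Q = {(s, "right"): 0.0 for s in range(N + 1)}
predecessors = {s + 1: [(s, "right")] for s in range(N)}   # pairs leading into a state

def priority(s, a):                     # size of the value change if (s, a) were updated
    r, s2 = model(s, a)
    target = r + GAMMA * max(Q.get((s2, b), 0.0) for b in ["right"])
    return abs(target - Q[(s, a)])

# Seed the queue with the transition next to the goal.
pq = [(-priority(N - 1, "right"), (N - 1, "right"))]
while pq:
    _, (s, a) = heapq.heappop(pq)
    r, s2 = model(s, a)
    target = r + GAMMA * max(Q.get((s2, b), 0.0) for b in ["right"])
    Q[(s, a)] += ALPHA * (target - Q[(s, a)])
    # Compute the effect on each predecessor pair; re-queue those above threshold.
    for (ps, pa) in predecessors.get(s, []):
        p = priority(ps, pa)
        if p > THETA:
            heapq.heappush(pq, (-p, (ps, pa)))
```

Because updates sweep backward from the goal in priority order, the whole chain converges after a handful of pops rather than repeated uniform sweeps.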
Soft Actor-Critic with Autotuned Temperature is a modification of the SAC reinforcement learning algorithm. SAC can be brittle with respect to the temperature hyperparameter. Unlike in conventional reinforcement learning, where the optimal policy is independent of the scaling of the reward function, in maximum entropy reinforcement learning the scaling factor has to be compensated by the choice of a suitable temperature, and a sub-optimal temperature can drastically degrade performance. To resolve this issue, SAC with Autotuned Temperature adds an automatic gradient-based temperature tuning method that adjusts the expected entropy over the visited states to match a target value.
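The gradient-based tuning step can be illustrated with a toy sketch: the temperature loss is $J(\alpha) = \mathbb{E}[-\alpha(\log\pi(a|s) + \bar{H})]$, and optimizing $\log\alpha$ keeps the temperature positive. Here the policy's entropy response to the temperature is a hand-made stand-in (an assumption for the example), not a real SAC actor.

```python
import math

target_entropy = 1.0            # desired expected entropy
log_alpha = math.log(0.1)       # optimize log(alpha) so alpha stays positive
lr = 0.05

for _ in range(3000):
    alpha = math.exp(log_alpha)
    entropy = min(2.0, alpha)   # toy stand-in: higher temperature -> higher entropy
    log_pi = -entropy           # E[log pi] = -H for the simulated policy
    # alpha_loss = -log_alpha * (log_pi + target_entropy); one gradient step:
    log_alpha += lr * (log_pi + target_entropy)
```

When the policy's entropy is below the target, $\alpha$ grows (exploration is rewarded more); when it exceeds the target, $\alpha$ shrinks. In this toy model the temperature settles where the simulated entropy matches the target.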
Asynchronous Proximal Policy Optimization
DouZero is an AI system for the card game DouDizhu that enhances traditional Monte-Carlo methods with deep neural networks, action encoding, and parallel actors. The Q-network of DouZero consists of an LSTM to encode historical actions and a six-layer MLP with hidden dimension 512. The network predicts a value for a given state-action pair based on the concatenated representation of action and state.
Mirror Descent Policy Optimization
Mirror Descent Policy Optimization (MDPO) is a policy gradient algorithm based on the idea of iteratively solving a trust-region problem that minimizes a sum of two terms: a linearization of the standard RL objective function and a proximity term that restricts two consecutive updates to be close to each other. It is based on Mirror Descent, which is a general trust region method that attempts to keep consecutive iterates close to each other.
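The trust-region step MDPO iterates can be illustrated on a tiny example. With the KL divergence as the proximity term, maximizing a linearized objective over a probability simplex has the closed-form multiplicative update $\pi_{k+1}(a) \propto \pi_k(a)\exp(\eta\, q(a))$. The action values and step size below are illustrative assumptions, not quantities from the paper.

```python
import math

q = [1.0, 0.5, -0.2]      # fixed advantage estimates for three actions
pi = [1 / 3, 1 / 3, 1 / 3]
eta = 0.5                  # step size controlling how close consecutive policies stay

for _ in range(50):
    # Mirror descent step with KL proximity: exponentiate and renormalize.
    w = [p * math.exp(eta * qa) for p, qa in zip(pi, q)]
    z = sum(w)
    pi = [x / z for x in w]
```

Each iterate stays KL-close to its predecessor, yet over many steps the policy concentrates on the highest-value action.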
TD-Gammon is a game-learning architecture for playing backgammon. It involves the use of a learning algorithm and a feedforward neural network. Credit: Temporal Difference Learning and TD-Gammon
Inverse Q-Learning
Inverse Q-Learning (IQ-Learn) is a simple, stable and data-efficient framework for Imitation Learning (IL) that directly learns soft Q-functions from expert data. IQ-Learn enables non-adversarial imitation learning, working in both offline and online IL settings. It remains performant even with very sparse expert data, and scales to complex image-based environments, surpassing prior methods by more than 3x. It is very simple to implement, requiring 15 lines of code on top of existing RL methods. <span class="description-source">Source: IQ-Learn: Inverse soft Q-Learning for Imitation</span>
Quantum Process Tomography
Ape-X DQN is a variant of a DQN with some components of Rainbow-DQN that utilizes distributed prioritized experience replay through the Ape-X architecture.
CLIPort is a language-conditioned imitation-learning agent that combines the broad semantic understanding (what) of CLIP [1] with the spatial precision (where) of Transporter [2].
Gated Transformer-XL
Gated Transformer-XL, or GTrXL, is a Transformer-based architecture for reinforcement learning. It introduces architectural modifications that improve the stability and learning speed of the original Transformer and XL variant. Changes include:

- Placing the layer normalization on only the input stream of the submodules. A key benefit of this reordering is that it enables an identity map from the input of the transformer at the first layer to the output of the transformer after the last layer. This is in contrast to the canonical transformer, where a series of layer normalization operations non-linearly transform the state encoding.
- Replacing residual connections with gating layers. The authors' experiments found that GRUs were the most effective form of gating.
Double Deep Q-Learning
Primal Wasserstein Imitation Learning
Primal Wasserstein Imitation Learning, or PWIL, is a method for imitation learning based on the primal form of the Wasserstein distance between the expert and agent state-action distributions. The reward function is derived offline, as opposed to recent adversarial IL algorithms that learn a reward function through interactions with the environment, and it requires little fine-tuning.
SEED (Scalable, Efficient, Deep-RL) is a scalable reinforcement learning agent. It utilizes an architecture that features centralized inference and an optimized communication layer. SEED adopts two state-of-the-art distributed algorithms, IMPALA/V-trace (policy gradients) and R2D2 (Q-learning).
TorchBeast is a platform for reinforcement learning (RL) research in PyTorch. It implements a version of the popular IMPALA algorithm for fast, asynchronous, parallel training of RL agents.
Policy Similarity Metric, or PSM, is a similarity metric for measuring behavioral similarity between states in reinforcement learning. It assigns high similarity to states for which the optimal policies in those states as well as in future states are similar. PSM is reward-agnostic, making it more robust for generalization compared to approaches that rely on reward information.
MyGym: Modular Toolkit for Visuomotor Robotic Tasks
We introduce myGym, a toolkit suitable for fast prototyping of neural networks in the area of robotic manipulation and navigation. Our toolbox is fully modular, enabling users to train their algorithms on different robots, environments, and tasks. We also include pretrained neural network modules for real-time vision that allow training visuomotor tasks with sim2real transfer. The visual modules can be easily retrained using the dataset generation pipeline with domain augmentation and randomization. Moreover, myGym provides automatic evaluation methods and baselines that help the user directly compare their trained model with state-of-the-art algorithms. We additionally present a novel metric, called learnability, to compare the general learning capability of algorithms in different settings, where the complexity of the environment, robot, and task is systematically manipulated. The learnability score tracks differences between the performance of algorithms in increasingly challenging setup conditions, and thus allows the user to compare different models in a more systematic fashion. The code is accessible at https://github.com/incognite-lab/myGym
ACKTR, or Actor Critic with Kronecker-factored Trust Region, is an actor-critic method for reinforcement learning that applies trust region optimization using a recently proposed Kronecker-factored approximation to the curvature. The method extends the framework of natural policy gradient and optimizes both the actor and the critic using Kronecker-factored approximate curvature (K-FAC) with trust region.
NoisyNet-DQN is a modification of a DQN that utilises noisy linear layers for exploration instead of ε-greedy exploration as in the original DQN formulation.
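A noisy linear layer replaces each weight $w$ with $\mu + \sigma\epsilon$, where $\epsilon$ is resampled per forward pass and $(\mu, \sigma)$ are learned. The pure-Python sketch below uses the independent-Gaussian-noise variant with illustrative sizes and initialization (real implementations typically use factorized noise for efficiency); it is an assumption-laden demonstration, not the paper's exact layer.

```python
import random

class NoisyLinear:
    def __init__(self, in_dim, out_dim, sigma0=0.5, seed=0):
        self.rng = random.Random(seed)
        bound = (3.0 / in_dim) ** 0.5
        self.mu_w = [[self.rng.uniform(-bound, bound) for _ in range(in_dim)]
                     for _ in range(out_dim)]
        self.sigma_w = [[sigma0 / in_dim ** 0.5] * in_dim for _ in range(out_dim)]
        self.mu_b = [0.0] * out_dim
        self.sigma_b = [sigma0 / in_dim ** 0.5] * out_dim

    def forward(self, x):
        out = []
        for i in range(len(self.mu_w)):
            # bias and weights are perturbed by fresh Gaussian noise each call
            acc = self.mu_b[i] + self.sigma_b[i] * self.rng.gauss(0, 1)
            for j, xj in enumerate(x):
                w = self.mu_w[i][j] + self.sigma_w[i][j] * self.rng.gauss(0, 1)
                acc += w * xj
            out.append(acc)
        return out

layer = NoisyLinear(4, 2)
y1 = layer.forward([1.0, 0.0, -1.0, 0.5])
y2 = layer.forward([1.0, 0.0, -1.0, 0.5])   # same input, different noise draw
```

Because exploration comes from the perturbed weights themselves, the same input produces different outputs across calls, removing the need for an ε-greedy schedule.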
Robust Predictable Control, or RPC, is an RL algorithm for learning policies that uses only a few bits of information. RPC brings together ideas from information bottlenecks, model-based RL, and bits-back coding. The main idea of RPC is that if the agent can accurately predict the future, then the agent will not need to observe as many bits from future observations. Precisely, the agent will learn a latent dynamics model that predicts the next representation using the current representation and action. In addition to predicting the future, the agent can also decrease the number of bits by changing its behavior. States where the dynamics are hard to predict will require more bits, so the agent will prefer visiting states where its learned model can accurately predict the next state.
Bayesian Reward Extrapolation
Bayesian Reward Extrapolation is a Bayesian reward learning algorithm that scales to high-dimensional imitation learning problems by pre-training a low-dimensional feature encoding via self-supervised tasks and then leveraging preferences over demonstrations to perform fast Bayesian inference.
Contrastive BERT
Contrastive BERT is a reinforcement learning agent that combines a new contrastive loss and a hybrid LSTM-transformer architecture to tackle the challenge of improving data efficiency for RL. It uses bidirectional masked prediction in combination with a generalization of recent contrastive methods to learn better representations for transformers in RL, without the need for hand-engineered data augmentations. For the architecture, a residual network is used to encode observations into embeddings $Y_t$. $Y_t$ is fed through a causally masked GTrXL transformer, which computes the predicted masked inputs $X_t$ and passes those together with $Y_t$ to a learnt gate. The output of the gate is passed through a single LSTM layer to produce the values that we use for computing the RL loss. A contrastive loss is computed using the predicted masked inputs $X_t$ and $Y_t$ as targets. For this, we do not use the causal mask of the Transformer.
NoisyNet-A3C is a modification of A3C that utilises noisy linear layers for exploration instead of the entropy-based exploration used in the original A3C formulation.
DeepCubeA + Imagination
About DeepCubeAI

DeepCubeAI is an algorithm that learns a discrete world model and employs Deep Reinforcement Learning methods to learn a heuristic function that generalizes over start and goal states. We then integrate the learned model and the learned heuristic function with heuristic search, such as Q* search, to solve sequential decision making problems. [[paper]](https://rlj.cs.umass.edu/2024/papers/Paper225.html) [[Code]](https://github.com/misaghsoltani/DeepCubeAI) [[PyPI]](https://pypi.org/project/deepcubeai/) [[Slides]](https://cse.sc.edu/foresta/assets/files/Slides--LearningDiscreteWorldModelsforHeuristicSearch.pdf) [[Poster]](https://cse.sc.edu/foresta/assets/files/Poster--LearningDiscreteWorldModelsforHeuristicSearch.pdf)

Key Contributions

DeepCubeAI is comprised of three key components:

1. Discrete World Model
   - Learns a world model that represents states in a discrete latent space.
   - This approach tackles two challenges: model degradation and state re-identification. Prediction errors less than 0.5 are corrected by rounding, and states are re-identified by comparing two binary vectors.
2. Generalizable Heuristic Function
   - Utilizes a Deep Q-Network (DQN) and hindsight experience replay (HER) to learn a heuristic function that generalizes over start and goal states.
3. Optimized Search
   - Integrates the learned model and the learned heuristic function with heuristic search to solve problems. It uses Q* search, a variant of A* search optimized for DQNs, which enables faster and more memory-efficient planning.

Main Results

- Accurate reconstruction of ground-truth images after thousands of timesteps.
- Achieved 100% success on Rubik's Cube (canonical goal), Sokoban, IceSlider, and DigitJump, and 99.9% success on Rubik's Cube with reversed start/goal states.
- Demonstrated significant improvement in solving complex planning problems and generalizing to unseen goals.
Forward-Looking Actor
FORK, or Forward-Looking Actor, is a type of actor for actor-critic algorithms. In particular, FORK includes a neural network, called the system network, that forecasts the next state given the current state and action; and a neural network, called the reward network, that forecasts the reward for a given (state, action) pair. With the system network and reward network, FORK can forecast the next state and take the value of the next state into account when improving the policy.
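The role of the two auxiliary models can be sketched with a 1-D toy: the "networks" below are hand-written functions standing in for learned models, and the augmented actor objective is an illustrative assumption about how the forecasts are combined, not the paper's exact loss.

```python
GAMMA = 0.99

def system_net(s, a):        # stand-in for the learned system network: next state
    return 0.9 * s + 0.1 * a

def reward_net(s, a):        # stand-in for the learned reward network
    return -(s ** 2) - 0.01 * a ** 2

def q_value(s, a):           # stand-in critic
    return reward_net(s, a) / (1.0 - GAMMA)

def policy(s):               # stand-in deterministic actor
    return -0.5 * s

def fork_actor_objective(s):
    a = policy(s)
    # Forward-looking term: forecast the next state with the system network
    # and also score the policy's action at that forecasted state.
    s2 = system_net(s, a)
    return q_value(s, a) + GAMMA * q_value(s2, policy(s2))
```

Maximizing this objective pushes the actor to improve not only its current action but also the action it would take at the state the system network predicts comes next.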
Taylor Expansion Policy Optimization
TayPO, or Taylor Expansion Policy Optimization, refers to a set of algorithms that apply $K$-th order Taylor expansions for policy optimization. This generalizes prior work, including TRPO as a special case, and can be thought of as unifying ideas from trust-region policy optimization and off-policy corrections. Taylor expansions share high-level similarities with both trust-region policy search and off-policy corrections. To get an intuition for these similarities, consider a simple 1D example. Given a sufficiently smooth real-valued function $f$ on the real line, the $K$-th order Taylor expansion of $f$ at $x_0$ is $f_K(x) = f(x_0) + \sum_{k=1}^{K} \frac{f^{(k)}(x_0)}{k!}(x - x_0)^k$, where $f^{(k)}(x_0)$ denotes the $k$-th order derivative of $f$ at $x_0$. First, a common feature shared by Taylor expansions and trust-region policy search is the inherent notion of a trust region: in order for convergence to take place, a trust-region constraint $|x - x_0| < R(f, x_0)$ is required, where $R(f, x_0)$ is the convergence radius. Second, when using the truncation $f_K(x)$ as an approximation to the original function $f(x)$, Taylor expansions satisfy the requirement of off-policy evaluation: evaluating the target policy with behavior data. Indeed, to evaluate the truncation $f_K(x)$ at any $x$ (the target policy), we only require the behavior policy "data" at $x_0$, i.e., the derivatives $f^{(k)}(x_0)$.
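The 1-D intuition above can be checked numerically. This sketch truncates the Taylor series of exp at $x_0 = 0$ (where every derivative equals $e^{x_0}$) and shows that the truncation, built only from "data" at $x_0$, is accurate near $x_0$ and degrades far from it; the specific function and evaluation points are illustrative choices.

```python
import math

def taylor_exp(x, x0, K):
    # K-th order truncation of exp around x0; all derivatives of exp at x0 are exp(x0)
    return sum(math.exp(x0) * (x - x0) ** k / math.factorial(k)
               for k in range(K + 1))

x0 = 0.0
near = abs(taylor_exp(0.1, x0, 2) - math.exp(0.1))   # inside the "trust region"
far = abs(taylor_exp(3.0, x0, 2) - math.exp(3.0))    # well outside it
```

The small error at `x = 0.1` versus the large error at `x = 3.0` mirrors why a trust-region constraint is needed when the expansion replaces the true objective.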
MushroomRL is an open-source Python library developed to simplify the process of implementing and running Reinforcement Learning (RL) experiments. The architecture of MushroomRL is built in such a way that every component of an RL problem is already provided, so most of the time users need only focus on the implementation of their own algorithms and experiments. MushroomRL comes with a strongly modular architecture that makes it easy to understand how each component is structured and how it interacts with the others; moreover, it provides an exhaustive list of RL methodologies.
Kalman Optimization for Value Approximation
Kalman Optimization for Value Approximation, or KOVA, is a general framework for addressing uncertainties while approximating value-based functions in deep RL domains. KOVA minimizes a regularized objective function that accounts for both parameter and noisy-return uncertainties. It remains feasible when using non-linear approximation functions such as DNNs, can estimate values in both on-policy and off-policy settings, and can be incorporated as a policy evaluation component in policy optimization algorithms.
NoisyNet-Dueling is a modification of a Dueling Network that utilises noisy linear layers for exploration instead of ε-greedy exploration as in the original Dueling formulation.
GradientDICE is a density ratio learning method for estimating the density ratio between the state distribution of the target policy and the sampling distribution in off-policy reinforcement learning. It optimizes a different objective from GenDICE by using the Perron-Frobenius theorem and eliminating GenDICE’s use of divergence, such that nonlinearity in parameterization is not necessary for GradientDICE, which is provably convergent under linear function approximation.
Generalized State-Dependent Exploration
Generalized State-Dependent Exploration, or gSDE, is an exploration method for reinforcement learning that uses more general features and resamples the noise periodically. State-Dependent Exploration (SDE) is an intermediate solution for exploration that consists in adding noise as a function of the state $s_t$, $\epsilon(s_t; \theta_\epsilon)$, to the deterministic action $\mu(s_t; \theta_\mu)$. At the beginning of an episode, the parameters $\theta_\epsilon$ of that exploration function are drawn from a Gaussian distribution. The resulting action is $a_t = \mu(s_t; \theta_\mu) + \epsilon(s_t; \theta_\epsilon)$. This episode-based exploration is smoother and more consistent than unstructured step-based exploration: during one episode, instead of oscillating around a mean value, the action $a$ for a given state $s$ will be the same. In the case of a linear exploration function $\epsilon(s; \theta_\epsilon) = \theta_\epsilon s$, by operation on Gaussian distributions, Rückstieß et al. show that each action element $a_j$ is normally distributed: $\pi_j(a_j \mid s) \sim \mathcal{N}(\mu_j(s), \hat{\sigma}_j^2)$, where $\hat{\sigma}_j = \sqrt{\sum_i (\sigma_{ij} s_i)^2}$. Because the policy distribution is known, the derivative of the log-likelihood with respect to the variance parameters $\sigma_{ij}$ can be obtained in closed form and plugged into the likelihood-ratio gradient estimator, which allows $\sigma_{ij}$ to be adapted during training. SDE is therefore compatible with standard policy gradient methods, while addressing most shortcomings of unstructured exploration. For gSDE, two improvements are suggested: 1. The parameters of the exploration function are sampled every $n$ steps instead of every episode. 2. Instead of the state $s$, any features can be used; the authors choose policy features (the last layer before the deterministic output) as input to the noise function.
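The resampling behavior can be sketched directly: a linear noise function of (stand-in) policy features, with its parameters redrawn every $n$ steps rather than once per episode. Dimensions, the resampling interval, and the noise scale below are illustrative assumptions.

```python
import random

random.seed(0)
FEATURE_DIM, ACTION_DIM, N_RESAMPLE, SIGMA = 3, 2, 4, 0.1

def draw_theta():
    # exploration-function parameters drawn from a Gaussian
    return [[random.gauss(0.0, SIGMA) for _ in range(ACTION_DIM)]
            for _ in range(FEATURE_DIM)]

def noise(theta, features):
    # linear noise function: epsilon_j = sum_i theta[i][j] * features[i]
    return [sum(theta[i][j] * features[i] for i in range(FEATURE_DIM))
            for j in range(ACTION_DIM)]

features = [1.0, -0.5, 0.2]      # stand-in for last-layer policy features
mu = [0.3, -0.1]                 # deterministic action from the actor
theta = draw_theta()
actions = []
for t in range(12):
    if t % N_RESAMPLE == 0:
        theta = draw_theta()     # gSDE: resample every N_RESAMPLE steps
    eps = noise(theta, features)
    actions.append([m + e for m, e in zip(mu, eps)])
```

Within one resampling window the same features yield the exact same action (smooth, consistent exploration); across windows the noise changes, avoiding the staleness of purely episode-based SDE.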
Protagonist Antagonist Induced Regret Environment Design, or PAIRED, is an adversarial method for approximate minimax regret to generate environments for reinforcement learning. It introduces an antagonist which is allied with the environment-generating adversary. The primary agent we are trying to train is the protagonist. The environment adversary's goal is to design environments in which the antagonist achieves high reward and the protagonist receives low reward. If the adversary generates unsolvable environments, the antagonist and protagonist would perform the same and the adversary would get a score of zero, but if the adversary finds environments the antagonist solves and the protagonist does not solve, the adversary achieves a positive score. Thus, the environment adversary is incentivized to create challenging but feasible environments, in which the antagonist can outperform the protagonist. Moreover, as the protagonist learns to solve the simple environments, the antagonist must generate more complex environments to make the protagonist fail, increasing the complexity of the generated tasks and leading to automatic curriculum generation.
Table Uniformity Method
The table uniformity approach is proposed to solve the problem of dialect determination. The method is based on a consistency measurement over a table Γδ, which has been returned by parsing a CSV file with a dialect ρδ, together with the dispersion of records and the inference of raw data types from fields.
Continuous Imitation Learning from Observation
Improved Gravitational Search algorithm
Metaheuristic algorithm
Ape-X DPG combines DDPG with distributed prioritized experience replay through the Ape-X architecture.
Four-dimensional A-star
The aim of 4D A* is to find the shortest path between two four-dimensional (4D) nodes of a 4D search space (a starting node and a target node), as long as a path exists. It achieves both optimality and completeness: the former because the returned path is the shortest possible, and the latter because if a solution exists the algorithm is guaranteed to find it.
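A compact A* sketch on a 4-D integer grid illustrates the search: unit-cost moves along each axis with a Manhattan-distance heuristic, which is admissible and so preserves optimality. The grid bounds and obstacle set are illustrative assumptions; the method described above additionally uses domain-specific 4D node definitions.

```python
import heapq

def astar_4d(start, goal, blocked=frozenset(), lo=0, hi=4):
    def h(p):                                  # admissible Manhattan heuristic
        return sum(abs(a - b) for a, b in zip(p, goal))

    open_heap = [(h(start), 0, start)]
    best = {start: 0}
    while open_heap:
        f, g, node = heapq.heappop(open_heap)
        if node == goal:
            return g                           # length of a shortest path
        if g > best.get(node, float("inf")):
            continue                           # stale heap entry
        for axis in range(4):
            for step in (-1, 1):
                nxt = list(node)
                nxt[axis] += step
                nxt = tuple(nxt)
                if not all(lo <= c <= hi for c in nxt) or nxt in blocked:
                    continue
                if g + 1 < best.get(nxt, float("inf")):
                    best[nxt] = g + 1
                    heapq.heappush(open_heap, (g + 1 + h(nxt), g + 1, nxt))
    return None                                # completeness: no path exists

dist = astar_4d((0, 0, 0, 0), (2, 1, 0, 3))
```

With no obstacles the shortest path length equals the Manhattan distance between the two 4D nodes; returning `None` when the open list empties reflects the completeness property.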
Blue River Controls is a tool that allows users to train and test reinforcement learning algorithms on real-world hardware. It features a simple interface based on OpenAI Gym, that works directly on both simulation and hardware.
True Online TD(λ) seeks to approximate the ideal online λ-return algorithm. It inverts this ideal forward-view algorithm to produce an efficient backward-view algorithm using eligibility traces, and it uses dutch traces rather than accumulating traces. Source: van Seijen and Sutton
In a Replacing Eligibility Trace, each time a state is revisited its trace is reset to 1, regardless of the presence of a prior trace. For the memory vector $z_t$: $z_t(s) = 1$ if $s = S_t$, and $z_t(s) = \gamma \lambda z_{t-1}(s)$ otherwise. Replacing traces can be seen as a crude approximation to dutch traces, which have largely superseded them, as dutch traces perform better and have a clearer theoretical basis. Accumulating traces remain of interest for nonlinear function approximation, where dutch traces are not available. Source: Sutton and Barto, Reinforcement Learning, 2nd Edition
Sarsa(λ) extends eligibility traces to action-value methods. It has the same update rule as TD(λ), but uses the action-value form of the TD error, $\delta_t = R_{t+1} + \gamma \hat{q}(S_{t+1}, A_{t+1}, \mathbf{w}_t) - \hat{q}(S_t, A_t, \mathbf{w}_t)$, and the action-value form of the eligibility trace, $\mathbf{z}_t = \gamma \lambda \mathbf{z}_{t-1} + \nabla \hat{q}(S_t, A_t, \mathbf{w}_t)$. Source: Sutton and Barto, Reinforcement Learning, 2nd Edition
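A tabular sketch makes the trace mechanics concrete. This toy uses accumulating traces on a small 1-D chain; the environment, rates, and episode count are illustrative assumptions.

```python
import random

random.seed(0)
N, GAMMA, LAM, ALPHA, EPS = 5, 0.9, 0.8, 0.1, 0.1
ACTIONS = (1, -1)                         # move right / move left
Q = {(s, a): 0.0 for s in range(N + 1) for a in ACTIONS}

def eps_greedy(s):
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(s, a)])

for _ in range(300):
    z = {k: 0.0 for k in Q}               # eligibility traces, reset per episode
    s, a = 0, eps_greedy(0)
    while s != N:
        s2 = min(max(s + a, 0), N)
        r = 1.0 if s2 == N else 0.0
        a2 = eps_greedy(s2)
        # action-value TD error (bootstrap is zero at the terminal state)
        delta = r + GAMMA * Q[(s2, a2)] * (s2 != N) - Q[(s, a)]
        z[(s, a)] += 1.0                  # accumulating trace for the visited pair
        for k in Q:                       # credit all pairs in proportion to traces
            Q[k] += ALPHA * delta * z[k]
            z[k] *= GAMMA * LAM
        s, a = s2, a2
```

After training, the action leading toward the goal dominates near the goal, since the traces propagate the terminal reward back along recently visited state-action pairs.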
Hybrid Firefly and Particle Swarm Optimization
Hybrid Firefly and Particle Swarm Optimization (HFPSO) is a metaheuristic optimization algorithm that combines the strong points of the firefly algorithm and particle swarm optimization. HFPSO tries to determine the start of the local search process properly by checking the previous global best fitness values.